Purpose:
This article explains how to decode EDAC messages to identify which memory device is reporting errors.
Target Audience:
All users of Advantech server-grade products running a Linux kernel that supports EDAC functions.
Use Cases:
1. First, search the kernel log for a string of the form "CPU_SrcID#x_MC#x_Chan#x_DIMM#x".
2. Use the following conversion table to decode each field:
SrcID#x: x = CPU location (x=0 is the 1st CPU, CPU0)
MC#x: x = memory controller (x=0 is the 1st controller)
*Please contact your Advantech representative for the memory controller diagram of the processor in use.
Chan#x: x = channel number (x=0 is the 1st channel)
DIMM#x: x = DIMM location (x=0 is the 1st DIMM)
3. For example, consider the following log entry:
ADV-node-2 kernel: EDAC MC1: 1 CE memory read error on CPU_SrcID#0_MC#1_Chan#0_DIMM#0 (channel:0 slot:0 page:0x1d51264 offset:0xc00 grain:32 syndrome:0x0 - err_code:0x0080:0x0090 SystemAddress:0x1d51264c00 ProcessorSocketId:0x0 MemoryControllerId:0x1 ChannelAddress:0x734499200 ChannelId:0x0 RankAddress:0x734499200 PhysicalRankId:0x0 DimmSlotId:0x0 Row:0x3aa26 Column:0x240 Bank:0x3 BankGroup:0x0 ChipSelect:0x0 ChipId:0x0)
CPU_SrcID#0 == CPU0
MC#1 == 2nd controller, which serves Channel C or D
Chan#0 == 1st channel of that controller, so the error is on Channel C
DIMM#0 == 1st DIMM of that channel
Summary: the memory module at "CPU0 Channel C1" is the suspect.
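The decoding steps above can also be scripted. The following is a minimal Python sketch, not an Advantech tool. The function names `decode_edac_label` and `slot_name` are hypothetical, and the sample line is a shortened copy of the log entry above. The channel lettering assumes two channels per memory controller, lettered consecutively from A, which matches this example only; check the memory controller diagram for your processor before relying on it.

```python
import re

# Pattern for the label described in step 1: CPU_SrcID#x_MC#x_Chan#x_DIMM#x
LABEL_RE = re.compile(r"CPU_SrcID#(\d+)_MC#(\d+)_Chan#(\d+)_DIMM#(\d+)")


def decode_edac_label(line):
    """Return the zero-based indices from an EDAC log line, or None if no label is found."""
    match = LABEL_RE.search(line)
    if match is None:
        return None
    cpu, mc, chan, dimm = (int(group) for group in match.groups())
    return {"cpu": cpu, "memory_controller": mc, "channel": chan, "dimm": dimm}


def slot_name(fields, channels_per_mc=2):
    """Build a name such as 'CPU0 Channel C1' from the decoded fields.

    ASSUMPTION: each controller serves `channels_per_mc` channels, lettered
    consecutively from 'A' (MC#0 -> A, B; MC#1 -> C, D). This layout is
    platform specific and is taken from the example in this article.
    """
    index = fields["memory_controller"] * channels_per_mc + fields["channel"]
    letter = chr(ord("A") + index)
    # DIMM#0 is the 1st DIMM of the channel, so the printed number is one-based.
    return f"CPU{fields['cpu']} Channel {letter}{fields['dimm'] + 1}"


# Shortened copy of the log entry from step 3.
sample = ("ADV-node-2 kernel: EDAC MC1: 1 CE memory read error on "
          "CPU_SrcID#0_MC#1_Chan#0_DIMM#0")

fields = decode_edac_label(sample)
if fields is not None:
    print(slot_name(fields))  # prints "CPU0 Channel C1"
```

On a platform with a different controller-to-channel layout, pass a different `channels_per_mc` value or replace the lettering logic entirely.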