Finding decent switches is hard: most of the modern offerings seem to suffer from the current disease of “If it needs a web interface, put Linux on it”. That’s fine for something like a router, but having your switch take minutes to boot isn’t. I also detest switches where it’s easy to leave the switch in a state where the current running configuration is different from the configuration that will be used at the next boot – plunging your network into disarray at the next power cut.
D-Link used to sell an excellent set of Web/SNMP manageable switches which they called DGS1216/24/48T. They boot in seconds, and you can usually pick them up on eBay for about £30. There are multiple versions of the hardware; this article deals with the D1/D2 revision of the 16 port version.
A few of the ones I bought recently had a peculiar pathology: after updating the firmware they would run fine (complete with working UI), and then about 100 seconds after power on, they would freeze and stop forwarding traffic. Given they’d worked fine before the firmware update I assumed it wasn’t a hardware problem. After I had 3 of them in this state, I decided to take a quick look and see if the problem was anything obvious, or if I could just downgrade the firmware with an EEPROM programmer. The first step was to get the lid off. Here’s a side lit (so you can read the chips) picture of the right hand side of the board:
The first thing to notice is that there are two headers (which I’ve populated). J3, black and at the top-left, turns out to be a serial port:
A little bit of Googling suggests that the Marvell 88E6218 is an ARM9E-based CPU, and sure enough, at 38400 baud, the serial port says:
Hit any key for boot setup
ARMboot code: +_armboot_start 00380000 -> 00392180
At the moment the switch becomes sad and stops forwarding traffic the serial port says:
Prestera commander shell server ready
***************************************
->Port 0 is Link up
MAC illegal
Ordinarily the next step would be to start tearing apart the firmware image of the switch, but there’s that promising other 20 pin header. There’s a canonical 20 pin ARM JTAG header, and sure enough that’s what it is. I connected a SEGGER J-Link and after a bit of trial and error wrote a board configuration file mapping out the RAM and the ROM. I dumped the ROM with:
[lab@lab dgs1216t]$ openocd -f openocd-dgs1216t.cfg -c init -c halt -c 'dump_image rom.bin 0xffe00000 0x00200000' -c shutdown
Open On-Chip Debugger 0.9.0 (2016-02-05-11:44)
Licensed under GNU GPL v2
...
target halted in ARM state due to debug-request, current mode: Supervisor
cpsr: 0x20000013 pc: 0x000c4958
dumped 2097152 bytes in 66.150436s (30.960 KiB/s)
shutdown command invoked
[lab@lab dgs1216t]$
and then searched for the string ‘MAC illegal’, but found nothing; it’s likely the bootloader is unpacking an image stored in the ROM. So instead I dumped a RAM image after the switch had booted. The location of the RAM in the memory map can be intuited by running strings(1) on the ROM image:
sdramImgAddr=0x00100000
board_cmd_sdramsize=0x800000
so
[lab@lab dgs1216t]$ openocd -f openocd-dgs1216t.cfg -c init -c halt -c 'dump_image ram.bin 0x0 0x00800000' -c shutdown
Open On-Chip Debugger 0.9.0 (2016-02-05-11:44)
Licensed under GNU GPL v2
...
Info : JTAG tap: dgs1216t.cpu tap/device found: 0x159463d3 (mfg: 0x1e9, part: 0x5946, ver: 0x1)
Info : Embedded ICE version 5
Info : 88e6218: hardware has 2 breakpoint/watchpoint units
dumped 8388608 bytes in 190.316849s (43.044 KiB/s)
shutdown command invoked
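The strings(1) step used above to locate the RAM can be sketched in a few lines of Python. This is only an illustration: the helper names `printable_strings` and `env_settings` are made up for this example, and the sample bytes are a synthetic stand-in for rom.bin built from the two settings quoted above.

```python
import re


def printable_strings(data: bytes, min_len: int = 4):
    """Yield (offset, text) for each run of printable ASCII, roughly like strings(1)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


def env_settings(data: bytes):
    """Collect name=value strings, the shape of the bootloader settings in the ROM."""
    settings = {}
    for _, text in printable_strings(data):
        name, sep, value = text.partition("=")
        if sep and name.isidentifier():
            settings[name] = value
    return settings


# Synthetic stand-in for rom.bin (hypothetical layout, real strings from the article).
sample = (b"\x00\xff" + b"sdramImgAddr=0x00100000\x00"
          + b"board_cmd_sdramsize=0x800000\x00\x01\x02")

found = env_settings(sample)
ram_size = int(found["board_cmd_sdramsize"], 16)
print(found)
print(ram_size)  # the length to pass to dump_image
```

On a real dump the same functions would be fed `open("rom.bin", "rb").read()`. The decoded size, 0x800000, is the 8388608 bytes that OpenOCD reports dumping.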
And the RAM image does contain the magic string “MAC illegal”:
00121cb0 6d 65 72 28 29 2d 2d 43 61 6e 27 74 20 73 74 61 |mer()--Can't sta|
00121cc0 72 74 20 74 69 6d 65 72 0a 00 00 00 69 4c 6f 76 |rt timer....iLov|
00121cd0 45 44 6c 69 6e 4b 53 77 69 54 63 48 00 00 00 00 |EDlinKSwiTcH....|
00121ce0 4d 41 43 20 69 6c 6c 65 67 61 6c 0a 00 00 00 00 |MAC illegal.....|
00121cf0 74 72 75 6e 6b 74 6d 72 2e 63 2d 2d 53 74 61 72 |trunktmr.c--Star|
00121d00 74 5f 43 68 65 63 6b 5f 48 6d 61 63 5f 54 69 6d |t_Check_Hmac_Tim|
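Because the RAM dump starts at address 0, a file offset in ram.bin is also a CPU address. A minimal sketch of the string search, using a synthetic image laid out like the hexdump above (`find_all` is a hypothetical helper, not part of any tool used here):

```python
def find_all(image: bytes, needle: bytes):
    """Return every offset at which needle occurs in image."""
    offsets = []
    pos = image.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = image.find(needle, pos + 1)
    return offsets


# Synthetic stand-in for ram.bin: zeros, plus the two strings at the
# addresses shown in the hexdump.
image = bytearray(0x122000)
image[0x121CCC:0x121CCC + 16] = b"iLovEDlinKSwiTcH"
image[0x121CE0:0x121CE0 + 12] = b"MAC illegal\n"

message_at = find_all(bytes(image), b"MAC illegal\n")
key_at = find_all(bytes(image), b"iLovEDlinKSwiTcH")
print([hex(a) for a in message_at], [hex(a) for a in key_at])
```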
To find the bit of code that’s causing the trouble we search for the address of the string, 0x121ce0. It occurs exactly once, at 0x9ab4, and the disassembly looks like:
0x9a90: cmp r3, #0
0x9a94: beq 0x9ab8
0x9a98: ldr r0, [pc, #20] ; 0x9ab4
0x9a9c: bl 0xc20b4
0x9aa0: bl 0x99b0
0x9aa4: mov r0, #2
0x9aa8: bl 0x93f4
0x9aac: b 0x9ab8
0x9ab0: .dword 0x121ccc
0x9ab4: .dword 0x121ce0
0x9ab8:
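Two small pieces of arithmetic sit behind this listing: finding the word that holds the string's address, and working out which instruction loads it. The sketch below assumes the image is little-endian (the article does not say so; pass `">"` otherwise) and uses hypothetical helper names. In ARM state the program counter reads as the instruction's own address plus 8, which is why `ldr r0, [pc, #20]` at 0x9a98 reads from 0x9ab4.

```python
import struct


def find_word(image: bytes, value: int, endian: str = "<"):
    """Return offsets of every 4-byte aligned word equal to value."""
    needle = struct.pack(endian + "I", value)
    return [off for off in range(0, len(image) - 3, 4)
            if image[off:off + 4] == needle]


def ldr_literal_address(insn_addr: int, imm: int) -> int:
    """Address read by 'ldr rX, [pc, #imm]' in ARM state (pc = insn_addr + 8)."""
    return insn_addr + 8 + imm


# Synthetic stand-in for the start of ram.bin: only the two literal-pool
# words from the listing are filled in.
image = bytearray(0xA000)
struct.pack_into("<I", image, 0x9AB0, 0x121CCC)  # address of the key string
struct.pack_into("<I", image, 0x9AB4, 0x121CE0)  # address of "MAC illegal\n"

refs = find_word(bytes(image), 0x121CE0)
loaded_from = ldr_literal_address(0x9A98, 20)
print([hex(r) for r in refs], hex(loaded_from))
```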
Now one of the nice things about ARM JTAG is that there’s also a full on-chip debugger. So gdb can set a breakpoint at 0x9a90 and see what happens if the CPU is forced to take the other branch in the test.
[lab@lab dgs1216t]$ arm-none-eabi-gdb
GNU gdb (GDB) 7.6.2
[...]
(gdb) target remote localhost:3333
Remote debugging using localhost:3333
0x00111f14 in ?? ()
(gdb) hbreak *(0x9a90)
Hardware assisted breakpoint 1 at 0x9a90
(gdb) set $pc=0xffff0000
(gdb) cont
Continuing.

Breakpoint 1, 0x00009a90 in ?? ()
(gdb) p $r3
$1 = 15
(gdb) set $r3=0
(gdb) cont
Continuing.
I’ve not added a reset method to the OpenOCD configuration file so a “poor man’s reset” of jumping to the start of the boot code in ROM is used, and since the boot-loader overwrites the code in RAM, hardware breakpoints have to be used.
After setting the register and taking the other branch in the test, the switch behaves normally. It would be possible to patch out this test in the ROM, but that’s going to require reverse engineering the unpacking code and even then the switch won’t work if the firmware is upgraded. A better solution requires looking more carefully at the function containing the test. I’m indebted to a friend who is much quicker at reading ARM assembly than I am, for helping me render it into the following:
void
func_99d8 (void)
{
  uint8_t L32[32];
  int L36;
  uint8_t L44[8];
  uint8_t L60[16];
  uint8_t L76[16];

  memcpy (L32, "iLovEDlinKSwiTcH", 17);
  L36 = 0;

  if (func_5eeec ())
    return;

  func_1f234 (L44);
  memset (L60, 0, 16);
  func_5ed88 (L44, L32, L60);
  func_c192c (L76, 16);

0x9a74: sub r3, fp, #60 ; L60
0x9a78: sub r2, fp, #76 ; L76
0x9a7c: mov r0, r3
0x9a80: mov r1, r2
0x9a84: mov r2, #16
0x9a88: bl 0x1159bc ; memcmp
0x9a8c: mov r3, r0
0x9a90: cmp r3, #0
0x9a94: beq 0x9ab8

  if (!memcmp (L60, L76, 16))
    return;

0x9a98: ldr r0, [pc, #20] ; 0x9ab4
0x9a9c: bl 0xc20b4 ; printf

  printf ("MAC illegal\n");
  func_99b0 ();
  func_93f4 (2);
}
The next obvious question is what’s in the local variables when the CPU gets to the memcmp?
^C
Program received signal SIGINT, Interrupt.
0x000c48dc in ?? ()
(gdb) delete 1
(gdb) hbreak *(0x9a88)
Hardware assisted breakpoint 2 at 0x9a88
(gdb) set $pc=0xffff0000
(gdb) cont
Continuing.

Breakpoint 2, 0x00009a88 in ?? ()
(gdb) x/8bx $sp+76-44 ;L44
0x479828: 0x00 0x21 0x91 0xe3 0xc7 0xa3 0x47 0x00
(gdb) x/16bx $r0 ;L60
0x479818: 0x5c 0xc1 0xb5 0xe4 0x8e 0xbe 0x35 0x1a
0x479820: 0xd4 0xb1 0xe3 0x64 0x77 0x60 0xc9 0x5e
(gdb) x/16bx $r1 ;L76
0x479808: 0x4f 0x38 0x50 0x49 0xb2 0x56 0x65 0xd4
0x479810: 0x22 0x27 0xde 0x86 0xda 0x5f 0xdb 0x84
(gdb)
L44 happens to be the MAC address of the switch. Looking at the whole function we can see that it takes the MAC address, does some operation with it and the key “iLovEDlinKSwiTcH” to generate a 16 byte value in L60, which is compared with something returned from func_c192c in L76 – but what?
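What func_5ed88 actually computes is not established in this article. The string "Start_Check_Hmac_Tim" in the RAM dump and the name of the bootloader variable that appears further down (board_cmd_md5) hint at an MD5-based digest, so one way to explore is to try a few obvious constructions against the observed L60 value. The sketch below is a guess at candidates, not a description of the firmware. Every construction in it is an assumption, and it may well report that nothing matches.

```python
import hashlib
import hmac

KEY = b"iLovEDlinKSwiTcH"                      # key string from the RAM dump
MAC = bytes.fromhex("002191e3c7a3")            # first six bytes of L44
OBSERVED = bytes.fromhex("5cc1b5e48ebe351ad4b1e3647760c95e")  # L60 at the memcmp


def candidate_digests(mac: bytes, key: bytes):
    """Guessed constructions only; none of these is confirmed by the firmware."""
    return {
        "md5(mac + key)": hashlib.md5(mac + key).digest(),
        "md5(key + mac)": hashlib.md5(key + mac).digest(),
        "hmac-md5(key, mac)": hmac.new(key, mac, hashlib.md5).digest(),
        "hmac-md5(mac, key)": hmac.new(mac, key, hashlib.md5).digest(),
    }


def matching(mac: bytes, key: bytes, expected: bytes):
    """Names of the candidate constructions whose digest equals expected."""
    return [name for name, digest in candidate_digests(mac, key).items()
            if digest == expected]


print(matching(MAC, KEY, OBSERVED) or "no candidate matched")
```

If one of the candidates did match, it would give a way to derive the value for a switch's own MAC address without JTAG. If none does, the next step is still reading func_5ed88.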
When the switch boots it prints “Hit any key for boot setup”. If a key is hit it presents a bootloader prompt which has a “help” command:
Hit any key for boot setup
ARMboot code: +_armboot_start 00380000 -> 00392180
ARMboot 1-> help
...
printenv- print environment variables
setenv - set environment variables
saveenv - save environment variables to persistent storage
...
reset - Perform RESET of the CPU
...
? - alias for 'help'
ARMboot 1-> printenv
bootdelay=5
...
board_cmd_macaddress=00:21:91:e3:c7:a3
board_cmd_md5=4f:38:50:49:b2:56:65:d4:22:27:de:86:da:5f:db:84

Environment size: 617/4092 bytes
ARMboot 1->
That last variable matches L76, the second argument to the memcmp call. Perhaps it can be set to match the first argument?
ARMboot 1-> setenv board_cmd_md5 5c:c1:b5:e4:8e:be:35:1a:d4:b1:e3:64:77:60:c9:5e
ARMboot 1-> saveenv
Erasing sector 1e size=1000 secoffset=1e0000 .
ARMboot 1-> reset
Hit any key for boot setup
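Retyping sixteen bytes from a gdb dump into a bootloader prompt invites typos. A small sketch that turns the `x/16bx` output for L60 into the colon-separated form used by board_cmd_md5 (the function names are made up for this example):

```python
def gdb_bytes(dump: str) -> bytes:
    """Parse 'ADDRESS: 0x.. 0x.. ...' lines, as printed by gdb's x/bx, into bytes."""
    out = bytearray()
    for line in dump.splitlines():
        _, sep, rest = line.partition(":")
        if not sep:
            continue  # skip prompts and blank lines
        out += bytes(int(token, 16) for token in rest.split())
    return bytes(out)


def env_value(data: bytes) -> str:
    """Format bytes the way the bootloader prints board_cmd_md5."""
    return ":".join(f"{b:02x}" for b in data)


l60_dump = """\
0x479818: 0x5c 0xc1 0xb5 0xe4 0x8e 0xbe 0x35 0x1a
0x479820: 0xd4 0xb1 0xe3 0x64 0x77 0x60 0xc9 0x5e
"""

value = env_value(gdb_bytes(l60_dump))
print("setenv board_cmd_md5 " + value)
```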
Eventually we hit the gdb breakpoint at memcmp again:
Breakpoint 2, 0x00009a88 in ?? ()
(gdb) x/16bx $r0 ;l60
0x479818: 0x5c 0xc1 0xb5 0xe4 0x8e 0xbe 0x35 0x1a
0x479820: 0xd4 0xb1 0xe3 0x64 0x77 0x60 0xc9 0x5e
(gdb) x/16bx $r1 ;l76
0x479818: 0x5c 0xc1 0xb5 0xe4 0x8e 0xbe 0x35 0x1a
0x479820: 0xd4 0xb1 0xe3 0x64 0x77 0x60 0xc9 0x5e
(gdb)
Success! Finally we unplug all the debugging cables and see if the switch is now happy – it is:
Hello!
I have a DES-2108 with the same trouble and do not have JTAG.
Can you send me a memory image or show func_5ed88 (L44, L32, L60) in assembler?
Hi
I have a similar problem with a DGS-1216T.
It ran fine for years with firmware v4.21.02.
The problem started when the switch was left without power for a few days. After power-on I could not get access to the management interface and the switch did not pass packets.
I reset the switch and could get access to the Web-Mgmt, but it always dies ~50 sec after boot and ~24 sec after the first ping response. It takes ~27 ping reply packets until the Mgmt IP stops replying.
I can change settings etc., and they are saved OK if done before the Mgmt freezes. The reboot function from the Mgmt also works, and if used before the freeze it gives an easier way to test over multiple “reboot runs”. Sadly I have tried changing all the settings and it doesn’t help.
I also noticed the Mgmt MAC address had changed to 00-00-55-34-**-** and the company name had changed from D-Link Corporation to “COMMISSARIAT A L`ENERGIE ATOM.”, which translates to “Atomic Energy Commission” and raises questions…
Did you also find that the MAC address had changed (the original is on the bottom of the device)?
Any idea why the MAC address would change, especially the vendor part?
Have you compared the RAM of an original working device? Does it also have the weird “iLovEDlinKSwiTcH” in it?
For example, is there any possibility that there is an attacker’s jump code in the boot and that it is the cause of this? Possibly exploit code gone bad that accidentally disables the management and was just meant to open a backdoor or cause a DoS?
Have you seen any signs that could be explained by a security breach/hacking, or is it just firmware code gone really bad by itself?
Currently the switch does still work as an unmanaged “dumb switch”, with no VLANs etc. supported. Computers that use Hyper-V have problems on the switch: only the host NIC’s traffic works (strange).
Is it wise to try a firmware upgrade from the Mgmt web page using the same 4.21.02 the switch already has? I fear it might brick because the Mgmt only works for about 24 sec.
Is there something that I could still try without connecting cables to the motherboard (and what cables/adapters do you recommend)?
For example, do you have a fixed firmware to try?
Hi.
I have a DGS-1216T and the same problem as yours. The MAC changed to 00-00-55-34-… I have only a serial console cable. After the main firmware starts it shows MAC 00-00-55-34-…, and setting it in the environment does nothing. The environment variables seem to be broken: the printenv command shows md5 twice. I uploaded the firmware file through xterm to memory, then flashed it to 0xFFE00000, and nothing changed. I tried to flash 1216Tldr.hex using SmartConsole and it worked well, but with no change to the main problem. So the MAC address is probably inside the main CPU package in some small flash, and probably can’t be accessed over serial. So JTAG may help, but who knows…
Hi.
It looks like I got it 🙂 Dear James, big thanks to you. You are the best.
1. I connected to it with an FTDI serial adapter and HyperTerminal at 38400, 8, 1.
2. I set board_cmd_macaddress and md5 to blank. There were some invisible characters; HyperTerminal’s text capture helps.
3. Then I set them to your MAC and md5. I know my original MAC is on the backplate, but I don’t know how to get the md5 :).
4. I loaded 1612ldr.hex to 0x100000 using the loadx command and HyperTerminal’s send file.
5. I used the fburn command, BUT what I didn’t know before is that there is an important option ‘u’. It unlocks the flash! Without this option it appears to flash but nothing is written to the chip. So: fburn 0x100000 0xFFE00000 100000 u
6. I reset it with the S2 switch near the flash chip. The loader starts up and says:
Hit any key for boot setup
## Starting application at 0xffe00000 …
#### WSS-II Driver Initialzation Done ####
System MacAddress not the same Board Config!!
Loader HW Rev: e1
00 21 91 e3 c7 a3
desc:790000 rx_buf:7a0000
flash alloc[300008]
#### Image Verify Fail, WSS-II Loader start ####
7. I used the smart utility to flash DGS-1216T_D1D2_4_20_T14(1114105032).hex
AND finally a happy switch returns to life:
tftps_start
flash erase
flash erase
flash write
**
**
**
**
**
**
**
**
**
**
**
flash lock
cpu reset
Hit any key for boot setup
## Starting application at 0xffe00000 …
#### WSS-II Driver Initialzation Done ####
HW Rev: e1
System MacAddress not the same Board Config!!
**
**
changed to IVL mode.
DHCP_Start1
DHCP_Start4
00 21 91 e3 c7 a3
desc:790000 rx_buf:7a0000
#### WSS-II APP MAIN start ###
flash alloc[489e50]
WEB server running… [16634]
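PSS LITE Ver - 1.06_08
Supported boards:
+---------------------------------+--------------------------+
| Board name                      | Revisions                |
+---------------------------------+--------------------------+
| 07 - D-DX270-24G-3XG            |                          |
|                                 | 01 - Rev 0.1             |
|                                 | 02 - Rev 0.2 - DB-DX269-A|
+---------------------------------+--------------------------+
| 11 - DB-DX246-24G               |                          |
|                                 | 01 - Rev 0.1             |
+---------------------------------+--------------------------+
| 08 - RD-DX262-48G               |                          |
|                                 | 01 - Rev 0.1             |
+---------------------------------+--------------------------+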
Call gtInitSystem(index,boardRedId,localUnitNum,reloadEeprom), where:
index - The index of the system to be initialized.
boardRevId - The index of the board revision.
localUnitNum - The local unit number.
unit number is used as the first dev num in this box/board.
reloadEeprom - Whether the device's eeprom should be reloaded after start-init.
***************************************
Prestera commander shell server ready
***************************************
->Port 0 is Link up
Hello
It’s been a long time since you wrote that comment, but maybe you’re still here? Or is anybody here? I have a D-Link DGS-1248T with a similar problem and I’m trying to save it.