How I have fixed a lot of memory leaks in low-mem embedded system
Introduction
1. Using post-mortem analysis on low-mem board: Analysis of DDR RAM dumps
2. Fixing of compiler warnings
3. Static code analysis
4. Using HW mocking to debug memory leaks
Conclusion
Introduction
Let’s assume we have a tiny well-known commercially successful device. This device is Linux-based and is build on a tightly integrated ARM platform. Moreover, there is a bunch of memory leaks. The device has 32 megabytes of DDR RAM. On device start, there is a peak of memory consumption up to 30 MB with a following rollback to 24 MB. Yet, in a while we get crashed with OOM. So now I am going to describe a few trivial debugging techniques which helped me improve this situation significantly.
The following material is targeted at juniors, but some mature programmers may also find it useful.
As it came out, this time we had a really huge leak, so within the first iter I didn’t involve any complex and magical debugging. Instead, I repeatedly took some simple and trivial steps. They were:
1. Analysis of DDR RAM dumps
2. Fixing compiler warnings
3. Static code analysis
4. Running of Valgrind on a mocked application
Now, let’s go into details:
1. Using post-mortem analysis on a low-mem board: analysis of DDR RAM dumps
In order to perform debugging we have to put a memory dumping mechanism in a kernel OOM handler and pass it through to userspace. This is how it should be implemented: when OOM raises, we set an appropriate flag at a certain predefined memory address, flush caches and call the reset IRQ handler. At the first stage of reset handler performance we check a memory flag and, as if it is set, we put a copy of a register file and a DDR to a file on SD card. When copying is finished we get data suitable for T32, crash tool and other similar utilities. Now I will describe how we should deal with a crash tool:
1. Open ramdump in crash
2. Do “files” command
3. Look for repeatable patterns like:
... 19 c3675080 c3902d80 c3823000 REG /dev/ccinet0 20 c36cfb80 c3902d80 c3823000 REG /dev/ccinet0 21 c3ede600 c3902d80 c3823000 REG /dev/ccinet0 22 c36cf780 c3902d80 c3823000 REG /dev/ccinet0 ...
4. Grep sources for repeatable pattern
$ find -name “*.[chS]” -print0 | xargs -0 grep “/dev/ccinet0”5. Yes, somebody has forgotten to release a resource deep inside nested loops. Fix it.
6. Repeat steps
As a result of these manipulations gain about 1 MB of additional free memory. Moreover, we are not limited by “files” functionality of the crash. I have illustrated my investigations with “files” because it was the most effective way in my situation. Don’t hesitate to use all the power of this excellent tool.
2. Fixing of compiler warnings
All of us are developers so I will be short here:
This way I have saved an equal amount of free memory.
3. Static code analysis
I have used cppcheck:
1. cd /path/to/codebase/ 2. cppcheck \ -I"$BIONIC_INC" \ -I"$BIONIC_ARCH_INC" \ --enable=all -v --force . 2>&1 \ | tee log.cppcheck 3. vim log.cppcheck and have gone in a way similar to prev chapter.
The achieved results are similar to the ones discussed in previous chapters.
4. Using HW mocking to debug memory leaks
How about using our beloved Valgrind with low-mem systems? If we are a large company and have enough time we can assemble a special additional debug version of the board with enough amount of resources to run Valgrind. But what if we can’t? Or if we don’t have enough time? Or if we are a small company?
If we were smart enough we would already have, as I have illustrated in my previous article, a HAL and HW simulator. So we can run our SW in a simulator under Valgrind on a host that is powerful enough. With a HAL and HW simulator, which is appropriate and good enough, we will be able to fix all or almost all memory related issues.
But what if we are still learning, lazy or not smart enough? We can try to mock HW somehow, for example, with a mocking framework like GoogleMock. Yet, below I will illustrate this process using limited mocking functionality of GoogleTest, as it was sufficient in my situation.
So, we have an application implemented as an instance of PacketManager class:
$ cat Makefile .PHONY: all all: App1-gtest_run App1-gtest: $(Common_OBJECTS) \ ${CURDIR}/tests/PacketManager-test.cpp g++ -o $@ \ ${CURDIR}/gtest/gtest-all.cc \ ${CURDIR}/gtest/gtest_main.cc \ ${CURDIR}/tests/PacketManager-test.cpp \ $(App1_Static_LIBS) \ $(LIBS) .PHONY: App1-gtest_run check test App1-gtest_run check test: App1-gtest mkdir -p valgrind-gtest; \ valgrind -v --read-var-info=yes \ --leak-check=full \ --track-origins=no \ --log-file=valgrind-gtest/App1-gtest.`date +%F_%T`.valgrind \ ./App1-gtest .PHONY: check-strace check-strace: App1-gtest export DIR="strace-gtest/App1.$(shell date +%F_%T).trc"; \ mkdir -p "$${DIR}"; \ strace -ff -tt -s 1024 -o "$${DIR}/App1-gtest.trc" \ ./App1-gtest .PHONY: App1-gtest_run-all check-all test-all App1-gtest_run-all check-all test-all: App1-gtest mkdir -p valgrind-gtest; \ valgrind -v --read-var-info=yes \ --leak-check=full \ --track-origins=yes \ --show-reachable=yes \ --log-file=valgrind-gtest/App1-gtest_all.`date +%F_%T`.valgrind \ ./App1-gtest --gtest_also_run_disabled_tests .PHONY: check-strace-all check-strace-all: App1-gtest export DIR="strace-gtest/App1.$(shell date +%F_%T).trc"; \ mkdir -p "$${DIR}"; \ strace -ff -tt -s 1024 -o "$${DIR}/App1-gtest.trc" \ ./App1-gtest --gtest_also_run_disabled_tests .PHONY: cppcheck cppcheck: mkdir -p cppcheck; \ cppcheck --enable=all -v . 2>&1 | tee cppcheck/App1.`date +%F_%T`.cppcheck $ cat tests/PacketManager-test.cpp #include <cstdlib> #include <gtest/gtest.h> #include <UI/UItest.h> #include <PacketManager.h> namespace app1 { class PacketManagerTest: public ::testing::Test { protected: PacketManagerTest() { enum { argc = 2 }; char argv_0[] = "./App1"; char argv_1[] = "-DESTINATION=./tests/"; char* argv[argc] = { argv_0, argv_1 }; ui = new UItest(); ac = new AppConfig(argc, argv, ui); bs = new BinStream(); pm = new PacketManager(bs, ac, ui); } ~PacketManagerTest() { delete pm; pm = NULL; delete ac; ac = NULL; delete bs; bs = NULL; delete ui; ui = NULL; } PacketManager *pm; AppConfig *ac; BinStream *bs; UI *ui; enum { PcktInitState = PacketManager::PcktInitState , PcktTypeState = PacketManager::PcktTypeState , PcktLenState = PacketManager::PcktLenState , PcktEndState = PacketManager::PcktEndState , }; inline CircularBuffer* GetRingBuffer() const { return pm->ringBuffer; } inline Statistics* GetApp1Statistics() const { return &pm->app1Statistics; } inline int GetPcktState() const { return pm->pcktState; } inline void IterStatesPcktTable(Frame& frame) const { pm->IterStatesPcktTable(frame); } }; TEST_F(PacketManagerTest, Init_Read_EndPacket){ int i; /// It's real data captured from device some time before unsigned char const rxBuff[DATA_PACKET_SIZE] = { 'D', 'A', 'T', 'A', DATA_PACKET, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, //< CRC 0x00, 0x01, 0x02, 0x03, //< Timestamp 0x00, 0x00, 0x00, 0x00, //< Data ID 0x00, 0x00, 0x00, 0x01, //< header length (24 bytes) 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }; EXPECT_EQ(DATA_PACKET_SIZE, sizeof(rxBuff)); /// Here we deliver our data to application logic EXPECT_EQ(sizeof(rxBuff), pm->SavePacketReceived(rxBuff, sizeof(rxBuff))); EXPECT_EQ(sizeof(rxBuff), GetRingBuffer()->SpaceUsed()); /// Here we run main application class on our data Frame frame(GetApp1Statistics(), NULL, NULL); EXPECT_EQ(sizeof(rxBuff), frame.Read(GetRingBuffer())); EXPECT_EQ(0, GetRingBuffer()->SpaceUsed()); EXPECT_EQ(PcktInitState, GetPcktState()); EXPECT_FALSE(frame.is_processed()); IterStatesPcktTable(frame); EXPECT_EQ(4, frame.GetIndex()); EXPECT_EQ(PcktTypeState, GetPcktState()); EXPECT_FALSE(frame.is_processed()); IterStatesPcktTable(frame); EXPECT_EQ(5, frame.GetIndex()); EXPECT_EQ(PcktLenState, GetPcktState()); EXPECT_FALSE(frame.is_processed()); IterStatesPcktTable(frame); EXPECT_EQ(PACKET_HEADER_SIZE, frame.GetIndex()); EXPECT_EQ(PcktEndState, GetPcktState()); EXPECT_FALSE(frame.is_processed()); IterStatesPcktTable(frame); EXPECT_EQ(PACKET_HEADER_SIZE, frame.GetIndex()); EXPECT_EQ(PcktInitState, GetPcktState()); EXPECT_FALSE(frame.is_processed()); //< @warning WA for undefined reason //EXPECT_TRUE(frame.is_processed()); } } $ make ... Running main() from gtest_main.cc [==========] Running 1 tests from 1 test cases. [----------] Global test environment set-up. [----------] 1 test from PacketManagerTest [ RUN ] PacketManagerTest.Data_Packet [ OK ] PacketManagerTest.Data_Packet (1 ms) [----------] 1 test from PacketManagerTest (1 ms total) [----------] Global test environment tear-down [==========] 1 tests from 1 test cases ran. (1 ms total) [ PASSED ] 1 tests. $ cat valgrind-gtest/App1-gtest.2013-04-18_18:17:10.valgrind ==17112== Memcheck, a memory error detector ==17112== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al. ==17112== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info ==17112== Command: ./App1-gtest ==17112== Parent PID: 17106 ==17112== --17112-- --17112-- Valgrind options: --17112-- --suppressions=/usr/lib/valgrind/debian-libc6-dbg.supp --17112-- -v --17112-- --read-var-info=yes --17112-- --leak-check=full --17112-- --track-origins=no --17112-- --log-file=valgrind-gtest/App1-gtest.2013-04-18_18:17:10.valgrind --17112-- Contents of /proc/version: --17112-- Linux version 3.2.0-40-generic (buildd@allspice) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #64-Ubuntu SMP Mon Mar 25 21:22:10 UTC 2013 --17112-- Arch and hwcaps: AMD64, amd64-sse3-cx16 --17112-- Page sizes: currently 4096, max supported 4096 --17112-- Valgrind library directory: /usr/lib/valgrind --17112-- Reading syms from /home/vitaly/Projects/App1/App1/App1-gtest (0x400000) parse_type_DIE: confused by: <4><32fc36>: DW_TAG_structure_type DW_AT_declaration : 1 --17112-- WARNING: Serious error when reading debug info --17112-- When reading debug info from /home/vitaly/Projects/App1/App1/App1-gtest: --17112-- parse_type_DIE: confused by the above DIE --17112-- Reading syms from /lib/x86_64-linux-gnu/ld-2.15.so (0xd181000) --17112-- Considering /lib/x86_64-linux-gnu/ld-2.15.so .. --17112-- .. CRC mismatch (computed eabdc7b7 wanted 3ee54b4e) --17112-- Considering /usr/lib/debug/lib/x86_64-linux-gnu/ld-2.15.so .. --17112-- .. CRC is valid --17112-- Reading syms from /usr/lib/valgrind/memcheck-amd64-linux (0x38000000) --17112-- Considering /usr/lib/valgrind/memcheck-amd64-linux .. --17112-- .. CRC mismatch (computed fd32bc40 wanted 1b1bde8c) --17112-- object doesn't have a symbol table --17112-- object doesn't have a dynamic symbol table --17112-- Reading suppressions file: /usr/lib/valgrind/debian-libc6-dbg.supp --17112-- Reading suppressions file: /usr/lib/valgrind/default.supp ==17112== embedded gdbserver: reading from /tmp/vgdb-pipe-from-vgdb-to-17112-by-vitaly-on-??? ==17112== embedded gdbserver: writing to /tmp/vgdb-pipe-to-vgdb-from-17112-by-vitaly-on-??? ==17112== embedded gdbserver: shared mem /tmp/vgdb-pipe-shared-mem-vgdb-17112-by-vitaly-on-??? ==17112== ==17112== TO CONTROL THIS PROCESS USING vgdb (which you probably ==17112== don't want to do, unless you know exactly what you're doing, ==17112== or are doing some strange experiment): ==17112== /usr/lib/valgrind/../../bin/vgdb --pid=17112 ...command... ==17112== ==17112== TO DEBUG THIS PROCESS USING GDB: start GDB like this ==17112== /path/to/gdb ./App1-gtest ==17112== and then give GDB the following command ==17112== target remote | /usr/lib/valgrind/../../bin/vgdb --pid=17112 ==17112== --pid is optional if only one valgrind process is running ==17112== --17112-- REDIR: 0xd1999e0 (strlen) redirected to 0x380625c7 (???) --17112-- Reading syms from /usr/lib/valgrind/vgpreload_core-amd64-linux.so (0xdba6000) --17112-- Considering /usr/lib/valgrind/vgpreload_core-amd64-linux.so .. --17112-- .. CRC mismatch (computed 8f05ac98 wanted c6d0ab1a) --17112-- object doesn't have a symbol table --17112-- Reading syms from /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so (0xdda8000) --17112-- Considering /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so .. --17112-- .. CRC mismatch (computed fc059af4 wanted ffc668e8) --17112-- object doesn't have a symbol table --17112-- REDIR: 0xd199850 (index) redirected to 0xddacc60 (index) --17112-- REDIR: 0xd1998d0 (strcmp) redirected to 0xddadc20 (strcmp) --17112-- Reading syms from /lib/x86_64-linux-gnu/libpthread-2.15.so (0xdfb3000) --17112-- Considering /lib/x86_64-linux-gnu/libpthread-2.15.so .. --17112-- .. CRC mismatch (computed 8e430e5b wanted e619a829) --17112-- Considering /usr/lib/debug/lib/x86_64-linux-gnu/libpthread-2.15.so .. --17112-- .. CRC is valid --17112-- Reading syms from /lib/x86_64-linux-gnu/libc-2.15.so (0xe1d0000) --17112-- Considering /lib/x86_64-linux-gnu/libc-2.15.so .. --17112-- .. CRC mismatch (computed 3af7ebbf wanted 50fc58fa) --17112-- Considering /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.15.so .. --17112-- .. CRC is valid --17112-- warning: addVar: unknown size (to_buf) --17112-- warning: addVar: unknown size (p) --17112-- warning: addVar: unknown size (info) --17112-- REDIR: 0xe25ce30 (strcasecmp) redirected to 0xdba6610 (_vgnU_ifunc_wrapper) --17112-- REDIR: 0xe2591d0 (strnlen) redirected to 0xdba6610 (_vgnU_ifunc_wrapper) --17112-- REDIR: 0xe25f100 (strncasecmp) redirected to 0xdba6610 (_vgnU_ifunc_wrapper) --17112-- REDIR: 0xe25abc0 (__GI_strrchr) redirected to 0xddaca80 (__GI_strrchr) --17112-- REDIR: 0xe252f40 (malloc) redirected to 0xddac660 (malloc) --17112-- REDIR: 0xe25bc10 (memset) redirected to 0xdba6610 (_vgnU_ifunc_wrapper) --17112-- REDIR: 0xe25bc50 (__GI_memset) redirected to 0xddaf080 (memset) --17112-- REDIR: 0xe2617e0 (memcpy@@GLIBC_2.14) redirected to 0xdba6610 (_vgnU_ifunc_wrapper) --17112-- REDIR: 0xe317f90 (__memcpy_ssse3_back) redirected to 0xddadf30 (memcpy@@GLIBC_2.14) --17112-- REDIR: 0xe253580 (free) redirected to 0xddab7c0 (free) --17112-- REDIR: 0xe25b5f0 (bcmp) redirected to 0xdba6610 (_vgnU_ifunc_wrapper) --17112-- REDIR: 0xe32d500 (__memcmp_sse4_1) redirected to 0xddaeca0 (bcmp) --17112-- REDIR: 0xe25bbc0 (memcpy@GLIBC_2.2.5) redirected to 0xdba6610 (_vgnU_ifunc_wrapper) --17112-- REDIR: 0xe31d530 (__memmove_ssse3_back) redirected to 0xddadd20 (memcpy@GLIBC_2.2.5) --17112-- REDIR: 0xe2590f0 (__GI_strlen) redirected to 0xddacfc0 (__GI_strlen) --17112-- REDIR: 0xe2575b0 (strcmp) redirected to 0xdba6610 (_vgnU_ifunc_wrapper) --17112-- REDIR: 0xe301e10 (__strcmp_sse42) redirected to 0xddadb60 (strcmp) --17112-- REDIR: 0xe2542a0 (calloc) redirected to 0xddaacf0 (calloc) --17112-- REDIR: 0xe2590a0 (strlen) redirected to 0xdba6610 (_vgnU_ifunc_wrapper) --17112-- REDIR: 0xe332bd0 (__strlen_sse2_pminub) redirected to 0xddacfa0 (strlen) --17112-- REDIR: 0xe2592f0 (__GI_strncmp) redirected to 0xddad480 (__GI_strncmp) --17112-- REDIR: 0xe2574f0 (index) redirected to 0xdba6610 (_vgnU_ifunc_wrapper) --17112-- REDIR: 0xe301d60 (__strchr_sse42) redirected to 0xddacb20 (index) --17112-- REDIR: 0xe262d00 (strchrnul) redirected to 0xddaf3b0 (strchrnul) --17112-- REDIR: 0xffffffffff600000 (???) redirected to 0x380625b3 (???) --17112-- REDIR: 0xe3043b0 (__strcasecmp_sse42) redirected to 0xddad500 (strcasecmp) --17112-- REDIR: 0xe262af0 (__GI___rawmemchr) redirected to 0xddaf400 (__GI___rawmemchr) --17112-- REDIR: 0xe25b2a0 (memchr) redirected to 0xddadce0 (memchr) --17112-- REDIR: 0xe258a80 (__GI_strcpy) redirected to 0xddad0b0 (__GI_strcpy) --17112-- REDIR: 0xe2575f0 (__GI_strcmp) redirected to 0xddadbc0 (__GI_strcmp) ==17112== ==17112== HEAP SUMMARY: ==17112== in use at exit: 7,848 bytes in 95 blocks ==17112== total heap usage: 1,896 allocs, 1,801 frees, 87,643,726 bytes allocated ==17112== ==17112== Searching for pointers to 95 not-freed blocks ==17112== Checked 209,931,568 bytes ==17112== ==17112== 38 bytes in 1 blocks are possibly lost in loss record 8 of 61 ==17112== at 0xDDAC6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==17112== by 0x64CDBC: operator new(unsigned long) (in /home/vitaly/Projects/App1/App1/App1-gtest) ... ==17112== by 0x4115A1: app1::protobuf_AddDesc_app1lite_2eproto() (app1lite.pb.cc:305) ==17112== by 0x406998: _GLOBAL__sub_I__ZN3app135protobuf_AssignDesc_app1lite_2eprotoEv (app1lite.pb.cc:326) ==17112== by 0x65E2CC: __libc_csu_init (in /home/vitaly/Projects/App1/App1/App1-gtest) ==17112== by 0xE1F16FF: (below main) (libc-start.c:185) ==17112== ... ==17112== ==17112== LEAK SUMMARY: ==17112== definitely lost: 384 bytes in 4 blocks ==17112== indirectly lost: 0 bytes in 0 blocks ==17112== possibly lost: 1,496 bytes in 24 blocks ==17112== still reachable: 5,968 bytes in 67 blocks ==17112== suppressed: 0 bytes in 0 blocks ==17112== Reachable blocks (those to which a pointer was found) are not shown. ==17112== To see them, rerun with: --leak-check=full --show-reachable=yes ==17112== ==17112== ERROR SUMMARY: 8 errors from 8 contexts (suppressed: 2 from 2) --17112-- --17112-- used_suppression: 2 dl-hack3-cond-1 ==17112== ==17112== ERROR SUMMARY: 8 errors from 8 contexts (suppressed: 2 from 2) $
Conclusion
Sometimes trivial use of simple tools can help us achieve impressive results with little efforts. Sure, some complex and non-trivial leaks were left, but we can ignore them for now.
P.S. If you have to process this kind of bugfixing often, and if you would like to optimize it, you will find useful my next article.
56 коментарів
Підписатись на коментаріВідписатись від коментарів Коментарі можуть залишати тільки користувачі з підтвердженими акаунтами.