    ARM ITM trace PC sampling HOWTO for STLink and JLink/JTrace

    25.7.2020 00:59 | Read: 1664× | programming | last edited: 25.7.2020 08:35

    Magic incantations for GDB, openocd and the JLink GDB server to enable ITM PC and exception sampling. ARM ITM trace is a feature of Cortex MCUs with CoreSight: it lets you see what is going on inside the CPU and can act as a profiler. This post covers getting ITM to work with orbuculum. The text is in English so that it's googleable for lost souls who want to turn it on and need workarounds for some weird bugs. The target is the STM32F4 family.

    This was tested on 3 boards with STM32F407 and STM32F427 MCUs. If your crystal oscillator speed differs from 8 MHz, it might not work correctly. SWO speed should depend only on the CPU core clock, but experimentally I found that it kind of doesn't, or there is something weird going on with the clock config. There is a note about this later.

    First, clone orbuculum and switch to its Devel branch; it contains some GDB macros for ITM settings and also orbtop:

    git clone https://github.com/orbcode/orbuculum
    cd orbuculum
    git checkout Devel
    make
    

    Note the file Support/gdbtrace.init: it contains a lot of magic macros that we'll use later.

    JLink/JTrace

    Note that this needs an original JLink/JTrace; the cheap Chinese clones won't work. Power cycle the JLink first, just to make sure.

    First run the JLink GDB server in one terminal, e.g.:

    JLinkGDBServerCLExe -select USB -device Cortex-M4 -endian little -if SWD -speed auto -ir -LocalhostOnly
    

    In GDB, connect to the server. This expects the Support dir from orbuculum to be in the current directory. Magic incantations below. They are for a CPU core clock of 168 MHz; look at the comments in gdbtrace.init to see what the arguments mean. The monitor SWO EnableTarget command is part of JLink's GDB server, see the JLink user guide.

    The selected SWO speed below is 2000000 baud. You may have to reset the target first via monitor reset.

    The parameters below select PC sampling that shouldn't be too fast; otherwise you'd get a lot of ITM overflows. The DWT POSTPRESET setting (dwtPostReset) is important for this.

    target extended-remote :2331
    source Support/gdbtrace.init
    monitor SWO EnableTarget 168000000 2000000 0xFF 0
    enableSTM32SWO 4
    prepareSWO 168000000 2000000 0 0
     
    dwtSamplePC 1
    dwtSyncTap 3
    dwtPostTap 1
    dwtPostInit 1
    dwtPostReset 15
    dwtCycEna 1
     
    ITMId 1
    ITMGTSFreq 3
    ITMTSPrescale 3
    ITMTXEna 1
    ITMSYNCEna 1
    ITMEna 1
     
    ITMTER 0 0xFFFFFFFF
    ITMTPR 0xFFFFFFFF
    continue
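
    As a rough cross-check of the sampling settings above, the expected PC sample rate can be estimated from the DWT fields. This is only a sketch: it assumes the gdbtrace.init macros map directly onto the ARMv7-M DWT_CTRL fields (dwtPostTap to CYCTAP, dwtPostReset to POSTPRESET), that a PC sample packet is 5 bytes, and that SWO uses 10 bits on the wire per byte. The function names are made up for this illustration.

```python
def pc_sample_rate(core_hz: int, cyctap: int, postpreset: int) -> float:
    """Estimate DWT periodic PC samples per second.

    Assumed ARMv7-M behaviour: CYCTAP selects bit 6 (0) or bit 10 (1) of
    CYCCNT as the tap, i.e. one tick every 64 or 1024 core cycles, and the
    4-bit post-scaler reloads from POSTPRESET, so one sample is emitted
    every (POSTPRESET + 1) ticks.
    """
    if cyctap not in (0, 1):
        raise ValueError("cyctap must be 0 or 1")
    if not 0 <= postpreset <= 15:
        raise ValueError("postpreset is a 4-bit field")
    tap_cycles = 1024 if cyctap else 64
    return core_hz / (tap_cycles * (postpreset + 1))


def swo_load(samples_per_s: float, baud: int) -> float:
    """Fraction of the SWO link used by PC samples alone.

    Assumes 5 bytes per PC sample packet (1 header + 4 address bytes) and
    10 bits per byte on the wire (start bit, 8 data bits, stop bit).
    """
    return samples_per_s * 5 * 10 / baud


# Settings used above: 168 MHz core, dwtPostTap 1, dwtPostReset 15, 2 Mbaud.
slow = pc_sample_rate(168_000_000, cyctap=1, postpreset=15)
# Fastest possible setting, for comparison.
fast = pc_sample_rate(168_000_000, cyctap=0, postpreset=0)

print(f"slow: {slow:.0f} samples/s, link load {swo_load(slow, 2_000_000):.2f}")
print(f"fast: {fast:.0f} samples/s, link load {swo_load(fast, 2_000_000):.2f}")
```

    Under these assumptions the settings above give roughly ten thousand samples per second and use about a quarter of a 2 Mbaud link, while the fastest setting would need more than 60 times what the link can carry, which is where the ITM overflows come from.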
    

    Now you should see some output if you dump the port to a file, e.g. nc localhost 2332 > swo_file (the port belongs to the JLink GDB server and should pump out raw SWO data).

    You can use the pcsampl utility from these ITM tools. Let's try to parse the file you dumped from port 2332; firmware.elf is the firmware running on your board:

    ./pcsampl -e firmware.elf swo_file 2>/dev/null
    

    If the data is correct, you should see a meaningful result like:

        % FUNCTION
    10.77 *SLEEP*
    29.76 qstr_find_strn
    13.71 gc_collect_end
     6.97 mp_map_lookup
     6.00 gc_mark_subtree
     5.37 gc_alloc
     4.71 mp_execute_bytecode
     3.63 sha256_Transform
     1.80 mp_obj_get_type
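
    If the result is empty or nonsense, it can help to check whether the dump contains PC sample packets at all. The following is a minimal sketch and not how pcsampl works internally. It assumes the ARMv7-M ITM/DWT packet layout (0x17 plus 4 bytes for a PC sample, 0x15 plus 1 byte for a sleep sample, 0x70 for overflow, zero bytes ending in 0x80 for sync) and that the data starts on a packet boundary. The function name and the demo bytes are invented for illustration.

```python
from collections import Counter

# Header bits [1:0] of a source packet give the payload length in bytes.
SIZE_BY_BITS = {1: 1, 2: 2, 3: 4}


def summarize_itm(data: bytes) -> Counter:
    """Count packet kinds in a raw ITM/SWO byte stream (assumed layout)."""
    stats = Counter()
    i, n = 0, len(data)
    while i < n:
        header = data[i]
        i += 1
        if header == 0x00:
            # Sync packet: a run of zero bytes terminated by 0x80.
            while i < n and data[i] == 0x00:
                i += 1
            if i < n and data[i] == 0x80:
                i += 1
                stats["sync"] += 1
        elif header == 0x70:
            stats["overflow"] += 1
        elif header & 0x03 == 0:
            # Timestamp/extension packet: more bytes follow while the
            # continuation bit (bit 7) of the previous byte is set.
            byte = header
            while byte & 0x80 and i < n:
                byte = data[i]
                i += 1
            stats["protocol"] += 1
        else:
            size = SIZE_BY_BITS[header & 0x03]
            payload = data[i:i + size]
            i += size
            if len(payload) < size:
                stats["truncated"] += 1
            elif header == 0x17:
                stats["pc_sample"] += 1
            elif header == 0x15:
                stats["sleep_sample"] += 1
            elif header & 0x04:
                stats["other_hw"] += 1   # e.g. exception trace
            else:
                stats["software"] += 1   # ITM stimulus port data
    return stats


# Hand-made demo stream: sync, PC sample, sleep sample, overflow,
# and one 2-byte hardware packet.
demo = bytes.fromhex("000000000080" "1734120008" "1500" "70" "0e0f10")
print(dict(summarize_itm(demo)))
```

    On a real dump you would call it as summarize_itm(open("swo_file", "rb").read()). Many overflow packets relative to PC samples suggest that sampling is too fast for the selected SWO speed; hardly any recognizable packets suggest a wrong baud rate or a desynced stream.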
    

    You can also watch live with orbtop, which is part of orbuculum (look in the ofiles directory). The command below takes data from JLink's SWO port 2332, shows exceptions and prints at most 15 lines. It's like top, except that it shows time spent in functions:

    ./orbtop -E -e firmware.elf -v3 -c 15 -s localhost:2332
    

    Sample output if all goes well:

     25.51%     2712 bn_multiply_reduce_step
     12.18%     1295 frexpf
     12.18%     1295 bn_multiply_long
      7.79%      828 qstr_find_strn
      4.65%      495 display_loader
      3.11%      331 gc_mark_subtree
      2.91%      310 mp_map_lookup
      2.26%      241 gc_alloc
      2.12%      226 bn_multiply_reduce
      2.11%      225 mp_execute_bytecode
      1.72%      183 sha256_Transform
      1.52%      162 bn_subtract
      1.43%      152 bn_add
      1.37%      146 bn_is_less
      1.27%      135 bn_rshift
    -----------------
     82.13%     8736 of 10627 Samples
    
    
     Ex |   Count  |  MaxD | TotalTicks  |  AveTicks  |  minTicks  |  maxTicks 
    ----+----------+-------+-------------+------------+------------+------------
    
    [---H] Interval = 1018mS / 0 (~0 Ticks/mS)
    

    Note on crystal oscillator speed other than 8 MHz

    For some unknown reason, the above GDB incantations make the CPU output SWO data at 3 Mbaud instead of 2 Mbaud if the onboard oscillator runs at 12 MHz (the case of one tested board).

    Just change the EnableTarget line to this, so that JLink expects 3 Mbaud data:

    monitor SWO EnableTarget 168000000 3000000 0xFF 0
    

    Don't ask me why. I am just an engineer. I measured it with a logic analyzer and kept guessing the baud rate until it fit. According to the ARM docs, the SWO clock/prescaler should depend only on the CPU core clock. Evidently it doesn't. The docs hint at some cases where it is not based on the CPU core clock, but those were different cases. There could also be an issue with the core clock config that only seems to appear when SWO data is enabled.
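
    One way to reason about the mismatch is plain arithmetic. The sketch below assumes that in UART mode the SWO bit rate equals the trace clock divided by (prescaler + 1), which is how the ARM TPIU prescaler is usually described; the function names are made up for illustration. It does not prove the cause, it only shows which clock would be consistent with the measurement.

```python
def swo_prescaler(trace_clk_hz: int, baud: int) -> int:
    """Prescaler for a requested SWO baud rate.

    Assumed relation: baud = trace_clk / (prescaler + 1).
    """
    if trace_clk_hz % baud:
        raise ValueError("baud rate does not divide the trace clock evenly")
    return trace_clk_hz // baud - 1


def swo_baud(trace_clk_hz: int, prescaler: int) -> float:
    """Baud rate actually produced by a given clock and prescaler."""
    return trace_clk_hz / (prescaler + 1)


# Prescaler computed for the configuration in the text: 168 MHz, 2 Mbaud.
prescaler = swo_prescaler(168_000_000, 2_000_000)

# Clock that would produce the measured 3 Mbaud with that same prescaler.
implied_clk = 3_000_000 * (prescaler + 1)

print("prescaler:", prescaler)
print("implied trace clock [MHz]:", implied_clk / 1e6)
print("implied / expected clock:", implied_clk / 168_000_000)
print("12 MHz / 8 MHz crystal:", 12 / 8)
```

    The implied clock is 1.5 times the expected 168 MHz, the same ratio as 12 MHz to 8 MHz. That would fit a clock setup written for an 8 MHz crystal running unchanged on a 12 MHz one, but treat it as a hypothesis to check against the board's clock configuration.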

    STLink

    Success with STLink depends on the STLink firmware version. Some versions work, some do not output SWO data, and some randomly desync after you enable ITM trace.

    One workaround I had success with was connecting a USB-UART adapter directly to the SWO pin of the CPU instead of passing the data through STLink.

    If you have a lucky version, run openocd -f interface/stlink-v2.cfg -f target/stm32f4x.cfg and put the following into GDB. It enables SWO trace via STLink's protocol command, then enables PC sampling and exception sampling. We scale down the sampling speed to avoid too many ITM overflows.

    target extended-remote :3333
    source Support/gdbtrace.init
    monitor tpiu config internal swodump.log uart off 168000000 2000000
    monitor mmw 0xE0001000 69632 0
    monitor mmw 0xE0001000 103 510
    dwtPostReset 15
    set *0xe0001000=*0xe0001000 | 0x200
    continue
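
    The register pokes above are easier to follow once the masks are decoded. OpenOCD's mmw takes an address, a set mask and a clear mask, and 0xE0001000 is DWT_CTRL. The sketch below assumes the ARMv7-M DWT_CTRL bit layout and that dwtPostReset 15 simply writes 15 into the POSTPRESET field; the field table and helper names are written for this illustration and are not part of OpenOCD or orbuculum.

```python
# Subset of DWT_CTRL fields (assumed ARMv7-M layout): name -> (lowest bit, width)
DWT_CTRL_FIELDS = {
    "CYCCNTENA": (0, 1),
    "POSTPRESET": (1, 4),
    "POSTINIT": (5, 4),
    "CYCTAP": (9, 1),
    "SYNCTAP": (10, 2),
    "PCSAMPLENA": (12, 1),
    "EXCTRCENA": (16, 1),
}


def mmw(old: int, setbits: int, clearbits: int) -> int:
    """Mimic OpenOCD's mmw: clear bits first, then set bits."""
    return ((old & ~clearbits) | setbits) & 0xFFFFFFFF


def decode(value: int) -> dict:
    """Split a DWT_CTRL value into the fields listed above."""
    return {
        name: (value >> lsb) & ((1 << width) - 1)
        for name, (lsb, width) in DWT_CTRL_FIELDS.items()
    }


# Start from 0 for illustration only; the real register has other bits set.
reg = 0
reg = mmw(reg, 69632, 0)               # 69632 = 0x11000: bits 16 and 12
reg = mmw(reg, 103, 510)               # rewrite bits 8..1, set bit 0
reg = (reg & ~(0xF << 1)) | (15 << 1)  # assumed effect of dwtPostReset 15
reg |= 0x200                           # the final "set" line: bit 9

print(hex(reg))
for name, value in decode(reg).items():
    print(f"{name:11s} = {value}")
```

    Read this way, the first mmw turns on PC sampling and exception tracing, the second enables the cycle counter and seeds the post-scaler fields, and the last two lines select the slow tap with the largest post-scaler reload value.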
    

    It's a bit of a hairy mess, but it should work (168 MHz core clock and 2 Mbaud SWO). If you are lucky and have the right STLink FW version, a swodump.log file should appear in the directory where openocd was run. You can decode it with pcsampl as above.

    If the swodump.log file is empty, you have a bad STLink FW version and have to work around it with a USB-UART adapter on the SWO pin.

    Set the baud rate to 2 Mbaud and look at screen to see whether it spews data. It actually seems important to run screen, as it appears to set some flags in addition to what stty sets. This might be an issue with the specific adapter.

    stty -F /dev/ttyUSB0 2000000
    screen /dev/ttyUSB0 2000000 # look if data is flowing, then kill screen
    stty -F /dev/ttyUSB0 2000000
    

    Now you can either dump the data via cat from /dev/ttyUSB0 and decode it with pcsampl, or run orbuculum with orbtop (each in a separate terminal).

    Both binaries below are in the ofiles directory of orbuculum.

    ./orbuculum -p /dev/ttyUSB0 -a 2000000 -v2
    ./orbtop -E -e firmware.elf -v3 -p /dev/ttyUSB0 -a 2000000 -v2
    

    If successful, you'll see orbtop output like the one above.

    Final remarks

    Many roosters were sacrificed to make this work. You may see different behavior if you are on an unlucky FW version of your JTAG/SWD adapter.

    A logic analyzer with Pulseview/sigrok helps a lot to see whether you are getting data and whether it makes sense. The screenshot below shows that we are close, but there are framing errors at the selected speed, so it's not precise:

    There are also some IDEs that can enable this, like STM32 Cube IDE, but they have their own bugs. E.g. the Cube IDE definitely can't handle longer sampling, because it tries to store everything in memory, and in a very wasteful way at that. Even then it has problems with desync from the adapter (this can be the FW issue) and, depending on the board, often computes wrong ITM settings, likely also an STLink FW issue. It is possible to generate KCachegrind output with orbuculum, but so far I haven't had much success with it.
