Topic: Unreliable STM32 timings problem
LithiumOverdosE
« on: October 15, 2024, 10:56:11 22:56 »

I'm having really frustrating problems generating precise timings/delays in the project I am working on.
The target device is an STM32L431RCT6 on a fully functional custom board.
The clock is derived from a 20 MHz external crystal with 22 pF capacitors, so HSE is the PLL source and SYSCLK is 80 MHz; the APB2 timer clock is 10 MHz and the APB2 peripheral clock is 5 MHz.
I used STM32CubeIDE 1.16.1 and most of the configuration was done in CubeMX.

I'm trying to generate pulses of very precise duration, from 1 us to 15 us, on pins in order to drive three separate switching bridges, and I have to regulate the pulse width in steps of 0.5 us or less (not necessarily a round number).

Normally I would use some dsPIC or even an 18F because of their pretty much deterministic clock/cycle behaviour, but in this case I'm stuck with an STM32, which I don't normally use and don't have much experience with.

Obviously my first attempts were to use timer interrupts, but unlike dsPIC/PIC their behaviour seems to be somewhat unpredictable (BTW, I was unpleasantly surprised to find that jumps to simple functions take 300 ns, which is terrible compared to dsPIC/PIC).

According to hardware experiments so far, the only reasonably functional approach is to use DWT, as in this code example in main():

Code:
while (1)
{
  //if(start) {
  if (treatment_time < duration) {
    if (TIM1->CNT >= repetition_rate) {
      treatment_time += (float)repetition_rate / 1000;
      if (mode == 1) {            // 1 channel, without pause
        __disable_irq();
        TIM1->CNT = 0;
        start_time = 0;

        ch1_A_on();               // pulse A
        start = DWT->CYCCNT;
        ticks = pulse_width * (HAL_RCC_GetSysClockFreq() / 1000000);
        while ((DWT->CYCCNT - start) < ticks);
        ch1_off();

        ch1_B_on();               // pulse B
        start = DWT->CYCCNT;
        ticks = pulse_width * (HAL_RCC_GetSysClockFreq() / 1000000);
        while ((DWT->CYCCNT - start) < ticks);
        ch1_off();

        __enable_irq();
      } else if (mode == 2) {     // 1 channel, pause of one pulse width between A and B
        TIM1->CNT = 0;
        start_time = 0;

        ch1_A_on();               // pulse A
        start = DWT->CYCCNT;
        ticks = pulse_width * (HAL_RCC_GetSysClockFreq() / 1000000);
        while ((DWT->CYCCNT - start) < ticks);
        //ch1_off();
        //delay_us(pulse_width);
        ch1_off();

        start = DWT->CYCCNT;      // pause
        ticks = pulse_width * (HAL_RCC_GetSysClockFreq() / 1000000);
        while ((DWT->CYCCNT - start) < ticks);

        ch1_B_on();               // pulse B
        start = DWT->CYCCNT;
        ticks = pulse_width * (HAL_RCC_GetSysClockFreq() / 1000000);
        while ((DWT->CYCCNT - start) < ticks);
        ch1_off();
      }
    }
  }
  //}
}



In this code the variable mode is always 1, and ch1_A_on(), ch1_B_on() and ch1_off() are just bit-manipulation functions like this:

Code:
void ch1_A_on(void){
  GPIOB->ODR |= GPIO_PIN_15;   //A1=HIGH
  GPIOA->ODR |= GPIO_PIN_8;    //A2=HIGH
  GPIOA->ODR &= ~GPIO_PIN_9;   //B1=LOW
  GPIOB->ODR &= ~GPIO_PIN_13;  //B2=LOW
}

void ch1_B_on(void){
  GPIOA->ODR |= GPIO_PIN_9;    //B1=HIGH
  GPIOB->ODR |= GPIO_PIN_13;   //B2=HIGH
  GPIOB->ODR &= ~GPIO_PIN_15;  //A1=LOW
  GPIOA->ODR &= ~GPIO_PIN_8;   //A2=LOW
}

void ch1_off(void){
  GPIOB->ODR &= ~GPIO_PIN_15;  //A1=LOW
  GPIOA->ODR &= ~GPIO_PIN_8;   //A2=LOW
  GPIOA->ODR &= ~GPIO_PIN_9;   //B1=LOW
  GPIOB->ODR &= ~GPIO_PIN_13;  //B2=LOW
}
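
Each |= or &= on ODR above is a separate read-modify-write, so one state change takes four accesses and the four pins switch at slightly different moments. A possible alternative, sketched here and not taken from the original post, is to build one BSRR word per port: on STM32 GPIO the low 16 bits of BSRR set pins and the high 16 bits reset them, so all pins of one port change in a single store. The helper names bsrr_word(), ch1_a_on_portb() and ch1_a_on_porta() are hypothetical, and PIN() only mirrors the HAL's GPIO_PIN_n masks so the sketch builds on a PC.

```c
#include <stdint.h>

/* Same value as the HAL's GPIO_PIN_n masks (1 << n); redefined so this builds off-target. */
#define PIN(n) ((uint32_t)1u << (n))

/* Build one BSRR word: bits 0..15 set pins, bits 16..31 reset pins. */
uint32_t bsrr_word(uint32_t set_mask, uint32_t reset_mask)
{
    return (set_mask & 0xFFFFu) | ((reset_mask & 0xFFFFu) << 16);
}

/* Hypothetical per-port words for the state produced by ch1_A_on(). */
uint32_t ch1_a_on_portb(void) { return bsrr_word(PIN(15), PIN(13)); } /* A1 high, B2 low */
uint32_t ch1_a_on_porta(void) { return bsrr_word(PIN(8),  PIN(9));  } /* A2 high, B1 low */
```

On the target this would become two stores, GPIOB->BSRR = ch1_a_on_portb(); and GPIOA->BSRR = ch1_a_on_porta();, with the words ideally computed once outside the timed section. The two ports still switch one store apart.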


The logic analyzer measurements (also verified with an oscilloscope) give me the following:

For 5 us pulse width I get 5.792 us for the A1 pulse and 6.0 us for the B1 pulse
For 8 us pulse width I get 8.792 us for the A1 pulse and 9.0 us for the B1 pulse
For 15 us pulse width I get 15.708 us for the A1 pulse and 15.917 us for the B1 pulse

I know that the function calls ch1_A_on(), ch1_B_on() and ch1_off() also take some time (around 300 ns!).

What is also perplexing to me is that my A and B outputs are generated in the same way but they don't produce exactly the same pulse width, even when I generate only a single iteration.
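
One possible reading of the functions above: ch1_A_on() sets A1 first and ch1_off() clears A1 first, while ch1_B_on() sets B1 first but ch1_off() clears B1 only third. The B1 pulse would then stay high for two extra read-modify-write operations, which is consistent with the roughly 0.208 us difference in all three measurements. The remaining 0.7 to 0.8 us looks like a mostly fixed overhead, so it could be subtracted from the requested width. The sketch below assumes that; compensated_cycles() is a hypothetical helper, and overhead_ns is a value taken from scope measurements, not a datasheet figure. Working in nanoseconds also allows 0.5 us steps.

```c
#include <stdint.h>

/* Convert a requested pulse width to DWT cycles, minus a measured fixed overhead.
   Returns 0 if the requested pulse is not longer than the overhead itself. */
uint32_t compensated_cycles(uint32_t pulse_ns, uint32_t overhead_ns, uint32_t sysclk_hz)
{
    if (pulse_ns <= overhead_ns) {
        return 0u;
    }
    uint64_t net_ns = (uint64_t)pulse_ns - overhead_ns;
    /* 64-bit intermediate avoids overflow; the result rounds down to whole cycles. */
    return (uint32_t)((net_ns * sysclk_hz) / 1000000000ull);
}
```

With an 80 MHz SYSCLK, a 5000 ns request and 792 ns of measured overhead give 336 cycles instead of 400. Computing this once, before the pin is switched on, also keeps HAL_RCC_GetSysClockFreq() out of the timed section. The 15 us measurement shows a slightly different offset, so the overhead is not perfectly constant.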


So my question is whether there is some other approach to precise timings on STM32 and to regulating pulse duration in smaller steps (for example <= 500 ns)?

digitalmg
« Reply #1 on: October 16, 2024, 08:25:27 08:25 »

Hi,
In the CubeMX GPIO configuration of the output pins, you must set Maximum output speed to Very High
if you want short pin switching times.

dennis78
« Reply #2 on: October 16, 2024, 08:44:17 08:44 »

Have you thought about the timer's one-pulse mode? Assuming, of course, that it fits into the rest of your application's concept.
« Last Edit: October 16, 2024, 08:47:00 08:47 by dennis78 »

UncleBog
« Reply #3 on: October 16, 2024, 09:03:02 09:03 »

Your software controlled approach will be subject to the usual program timings such as clock rate, interrupt and branch overhead and optimisation level. You should be able to achieve timing to your clock resolution by setting up a timer and some comparators that are configured to switch GPIO directly.

sam_des
« Reply #4 on: October 16, 2024, 01:22:37 13:22 »

Hi,

Why not use the hardware PWM modes for pulse generation? You can set the frequency once and adjust the duty cycle as required.
That gives minimal software intervention and precise pulse widths.
You can route the PLL-derived SYSCLK to the timer as its clock. Check the clock configuration page in CubeMX.
AFAIK, you can also combine two 16-bit timers to form a 32-bit timer, giving you more range.
CubeMX will also most probably enable high-speed output drivers on the PWM pins to reduce rise/fall times.
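
As a rough illustration of the PWM suggestion, the sketch below computes the auto-reload (ARR) and compare (CCR) values for a given timer clock, repetition period and pulse width. It assumes PWM mode 1 with an up-counting timer and a prescaler of 0, where the period is ARR + 1 ticks and the output is active for CCR ticks. The type pwm_regs_t and the function pwm_regs_for() are hypothetical names, not part of the HAL.

```c
#include <stdint.h>

typedef struct {
    uint32_t arr;   /* auto-reload value: period is arr + 1 timer ticks */
    uint32_t ccr;   /* compare value: output active for ccr ticks in PWM mode 1 */
} pwm_regs_t;

/* Convert period and pulse width (both in ns) to register values; rounds down. */
pwm_regs_t pwm_regs_for(uint32_t timer_clk_hz, uint32_t period_ns, uint32_t pulse_ns)
{
    pwm_regs_t r;
    uint64_t period_ticks = ((uint64_t)period_ns * timer_clk_hz) / 1000000000ull;
    uint64_t pulse_ticks  = ((uint64_t)pulse_ns  * timer_clk_hz) / 1000000000ull;
    r.arr = (period_ticks > 0u) ? (uint32_t)(period_ticks - 1u) : 0u;
    r.ccr = (uint32_t)pulse_ticks;
    return r;
}
```

With an 80 MHz timer clock, a 100 us period and a 1.5 us pulse give ARR = 7999 and CCR = 120. On a 16-bit timer ARR has to stay at or below 65535, otherwise a prescaler is needed.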

sam_des

LithiumOverdosE
« Reply #5 on: October 16, 2024, 08:35:27 20:35 »

Quote from: digitalmg
In the CubeMX GPIO configuration of the output pins, you must set Maximum output speed to Very High
if you want short pin switching times.

That's the first thing I did.

Quote from: dennis78
Have you thought about the timer's one-pulse mode? Assuming, of course, that it fits into the rest of your application's concept.

I did. The problem is that this particular processor doesn't seem to support it.

Quote from: UncleBog
Your software controlled approach will be subject to the usual program timings such as clock rate, interrupt and branch overhead and optimisation level. You should be able to achieve timing to your clock resolution by setting up a timer and some comparators that are configured to switch GPIO directly.

Yes, that's the first thing I tried, but there seems to be a lag somewhere else on the way to the output pin registers.

Quote from: sam_des
Why not use the hardware PWM modes for pulse generation? You can set the frequency once and adjust the duty cycle as required.
That gives minimal software intervention and precise pulse widths.
You can route the PLL-derived SYSCLK to the timer as its clock. Check the clock configuration page in CubeMX.
AFAIK, you can also combine two 16-bit timers to form a 32-bit timer, giving you more range.
CubeMX will also most probably enable high-speed output drivers on the PWM pins to reduce rise/fall times.

HW PWM is not an option because I have to drive 3 bridges and there is only a single PWM, which I use in a different part of the circuit for a buck converter.
Combining 2 timers might be a good idea; I have to check that out.

Posted on: October 16, 2024, 08:22:45 20:22 - Automerged

In the meantime, this code seems to work somewhat better when I write to the BSRR registers directly.
There is less variation between pulses, but it is still in the range of a few hundred ns (the adjustment resolution is better though).

Code:
while(1) {
        __asm volatile (
            "LDR R0, =0x48000418 \n"     // Load address of GPIOB->BSRR
            "MOVS R1, #0x8000 \n"        // Set bit for PB15 (0x8000)
            "STR R1, [R0] \n"            // Set PB15
            "MOVS R1, #0x80000000 \n"    // Reset bit for PB15 (0x80000000)
            "STR R1, [R0] \n"            // Reset PB15
            :                            // no outputs
            :                            // no inputs
            : "r0", "r1", "cc", "memory" // tell the compiler what the asm overwrites
        );
}


I also tried running it from RAM with pretty much the same results.

Code:
__attribute__((section(".data"))) void Toggle_Pins_RAM(void) {
    while(1) {
        GPIOB->BSRR = GPIO_BSRR_BS15;  // Set PB15
        GPIOB->BSRR = GPIO_BSRR_BR15;  // Reset PB15
    }
}


What I failed to mention is that because I'm driving 3 full bridges I have to address 12 pins.
That said, the problem appears also when running just 2 pins for tests.


I got additional advice to turn on cache and prefetch but I'm yet to try this (I'm a bit sceptical but will see what happens).

LithiumOverdosE
« Reply #6 on: October 19, 2024, 12:17:00 00:17 »

Just an update in case someone runs into a similar problem.

I switched from general purpose timers to the SysTick timer and it improved the situation significantly.
Every few pulses some of them are shorter by ~45 ns, but for my purposes it is good enough.
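
The post does not show how SysTick was used. One plausible way, sketched here purely as an assumption, is to poll the SysTick current-value register in the same style as the DWT loop earlier in the thread. SysTick is a 24-bit down-counter, so the elapsed count has to be masked to 24 bits. The helper systick_elapsed() is a hypothetical name, and the sketch assumes the reload value is 0xFFFFFF so the counter runs freely.

```c
#include <stdint.h>

/* Ticks elapsed between two readings of a free-running 24-bit down-counter.
   Assumes the reload value is 0xFFFFFF and less than one full wrap has passed. */
uint32_t systick_elapsed(uint32_t start_val, uint32_t now_val)
{
    return (start_val - now_val) & 0x00FFFFFFu;
}
```

On the target the two arguments would be readings of SysTick->VAL, and the wait loop would compare systick_elapsed() against the required tick count.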

Finally, my conclusion is that the ARM architecture may be sophisticated, but for reliable precise timings I would always choose some dsPIC or similar "classical" microcontroller.

alien
« Reply #7 on: October 19, 2024, 09:25:26 09:25 »


Quote from: LithiumOverdosE
Finally, my conclusion is that the ARM architecture may be sophisticated, but for reliable precise timings I would always choose some dsPIC or similar "classical" microcontroller.

Yes, and exactly for this reason I still choose general purpose modern microcontrollers (like Microchip mid-range/enhanced mid-range parts and even some single-clock 8051s) for low to medium complexity projects, and they deliver what is expected of them. The "predictability" aspect is really very simple in these MCUs, as you are very close to the guts of the MCU ...

bobcat1
« Reply #8 on: October 20, 2024, 09:44:37 09:44 »

Hi,
Your approach to writing the software is wrong.
Generating precise timing should involve hardware timers, not general software loops or diving into assembly language.
The STM32 timers are sophisticated hardware, sometimes hard to configure, but if you read the ST timer book you will be able to do it.
If the timer resolution is not enough for you, you can choose an STM32 with a high-resolution timer, like the STM32G4xx, which can go down to nanosecond resolution.
I have done a similar project using the hardware timers to produce full-bridge control, and it works very well using only 2 timers.
In the G series you have 8 timers (more than enough...).

The dsPIC is a far inferior microcontroller compared to the STM32: all high-end software (timing, DSP...) must be written in assembly language, a language I abandoned long ago when I switched from 8051 to MSP430.


All the best

Bobi

LithiumOverdosE
« Reply #9 on: October 23, 2024, 09:02:38 09:02 »

I agree with you regarding my approach being messy.
However, as I stated earlier, I'm stuck with this particular STM32 (not my choice) and there's nothing I can do to change it at this point, when the darn thing is already integrated with the rest of the hardware.
Considering the time constraints of this project, I found the loop approach to work satisfactorily for my purposes, because I simply don't have much time to spend on debugging/troubleshooting (too) complex hardware.

Of course I tried using timers/interrupts in several ways, but there was always some problem with the consistency of the timings.
Judging from my searches on the net, it seems that I'm not the only one who has encountered similar problems with timing precision/consistency on STM32.

That said, I disagree with you regarding the dsPIC.
While it is true that the dsPIC is less sophisticated, for the same reason it has more predictable behaviour.
Also, if extreme precision is not required I usually use pure C for work with timers/interrupts, and I'm not quite sure why you would think that assembler is required to do so.

Too much sophistication inevitably leads to a more complex debugging/troubleshooting process, because any internal section in the chain can cause latency and unpredictable behaviour.

So, while ARMs in general provide more sophistication, in some cases (such as this one) they also add quite a bit to the development time and make debugging more complex.
In this case I simply don't need that sophistication, so it leads to unnecessary additional time spent on banal things such as timings on output pins.
Personally I tend to use the simplest possible microcontroller for a particular project exactly for such reasons, and in this case with a dsPIC/PIC24 I would already have it up and running properly with timers/interrupts.


 
wild
« Reply #10 on: October 24, 2024, 01:15:53 01:15 »

Your approach was MORE than messy: it was sloppy!
1) Why on earth did you set the timer/peripheral clocks to 10/5 MHz? Set them to 80 MHz and your jitter problems will almost disappear.
2) Read up on TIM1 or TIM2: they can do exactly what you need!
RTFMs, RTFMs, RTFMs!!!!!
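
To put numbers on point 1, the timer clock fixes the smallest possible adjustment step. The small sketch below works that out with integer arithmetic; tick_ps() and ticks_per_step() are hypothetical helper names used only for illustration.

```c
#include <stdint.h>

/* Duration of one timer tick in picoseconds (integer division, rounds down). */
uint32_t tick_ps(uint32_t timer_clk_hz)
{
    return (uint32_t)(1000000000000ull / timer_clk_hz);
}

/* Number of whole timer ticks that fit into one adjustment step of step_ns. */
uint32_t ticks_per_step(uint32_t timer_clk_hz, uint32_t step_ns)
{
    return (uint32_t)(((uint64_t)step_ns * timer_clk_hz) / 1000000000ull);
}
```

At 10 MHz one tick is 100 ns, so a 0.5 us step is only 5 ticks; at 80 MHz one tick is 12.5 ns and the same step is 40 ticks.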

Reference manual
Programming manual
Errata
AMBA
M4 Tech Ref
ARM M4 Tech Ref



BTW, as compensation, you can use the documentation for the STM32L443RC instead of the STM32L431RC, because all of the following chips use the same die (YES, INSIDE they are the SAME):
Code:
DIE435  STM32L431C(B-C)Tx
DIE435  STM32L431C(B-C)Ux
DIE435  STM32L431C(B-C)Yx
DIE435  STM32L431K(B-C)Ux
DIE435  STM32L431R(B-C)Ix
DIE435  STM32L431R(B-C)Tx
DIE435  STM32L431R(B-C)Yx
DIE435  STM32L431VCIx
DIE435  STM32L431VCTx
DIE435  STM32L432K(B-C)Ux
DIE435  STM32L433C(B-C)Tx
DIE435  STM32L433C(B-C)Ux
DIE435  STM32L433C(B-C)Yx
DIE435  STM32L433R(B-C)Ix
DIE435  STM32L433R(B-C)Tx
DIE435  STM32L433R(B-C)Yx
DIE435  STM32L433RCTxP
DIE435  STM32L433VCIx
DIE435  STM32L433VCTx
DIE435  STM32L442KCUx
DIE435  STM32L443CCFx
DIE435  STM32L443CCTx
DIE435  STM32L443CCUx
DIE435  STM32L443CCYx
DIE435  STM32L443RCIx
DIE435  STM32L443RCTx
DIE435  STM32L443RCYx
DIE435  STM32L443VCIx
DIE435  STM32L443VCTx

LithiumOverdosE
« Reply #11 on: November 04, 2024, 09:30:48 21:30 »

Please calm down.

The code examples I posted are just examples and of course I used higher speeds as well.


I did RTFM and tried different approaches including use of timer interrupts as I have stated in this thread.
 
Given the long pipeline, varying latency is a well-known, common problem with ARMs (check out various forums, including the official STM32 one).
Also, changing compiler optimizations produces different latencies in each case, but they're always present.

Of course, if you're able to achieve precise timings on STM32 output pins, feel free to do so and post scope screenshots.

I will be happy to be proven wrong, but ATM I'm firmly standing behind my statement that less complex 8-bit and 16-bit MCUs (AVR, PIC, PIC24, dsPIC etc.) are more suitable for precise timings on output pins (a reliably predictable number of clock ticks).