7 years, 3 months ago.

String concatenation

Hello, I am working on creating packets of data for a project, and I have been having a lot of trouble with concatenation using strings. I have tried a few different methods with similar results:

MbedJSONValue library:

Huge memory hog especially with the board i'm using(LPC4088 which only has 96k ram) would run the board out of memory with dynamic allocation of memory for objects.

String Streams:

this method seemed to work well until we scaled it up to our full bandwidth of data, which caused an odd error, Basically what happened is the stream would continue to concatenate until it reached a length at which it would overwrite itself or halt the program all together with plenty of memory left.

Strcat:

I moved to a simpler c method where i would use sprintf to add our data points to the packet and then concatenate that to the end of the packet string, which was pre-allocated with a 10000 character array(just under 30k of ram left) and it ran into a similar problem, at some magic number it would begin to overwrite previous data and not grow past 2200 characters.

I have included some code, I would like to know if i'm running into some limitation of strings on this processor or something? I have both the strcat and string stream method below.

String Stream

string* make_packet(float* float_array, char* time)
{
    	//Create a string *(stream) object to easily format data into JSON format Manually
    	stringstream* stream = new stringstream;
    	//String to hold and return packet to main
    	string* packet_data = new string;
    	//Create start of packet, including message number, and time stamp.
    	*(stream) << "{\"mcount\":" << mcount << ","; //begin packet and insert message number.
   	*(stream) << "\"timestamp\":\"" << time << "\",\"d\":[";
    	//Add all of the data to the packet, append each point to the packet until full desired message length is reached
	//Floats have 3 decimal places
    	stream->precision(3);
    
    	for(int i = 0; i<MSG_LEN; i++)
    	{
        	//create data points/data numbers
        	*(stream) << "{\"p\":" << float_array[i] << ",\"n\":"<< i << "},";
        	*(packet_data) += stream->str();
        	pc.printf("size of string: %i\n", (stream->str()).size());
    	}
    	//Turn string stream into a string, finish the packet and return it to main to be sent.
    	*(packet_data) = stream->str();
    	*(packet_data) += "]}";
	delete stream;
	return packet_data;
}

Strcat

char* make_packet(float (&float_array)[MSG_LEN], char* time)
{
    char* packet_data = new char[10000];
    char temp[100];
    strcpy(packet_data,"\0");
    sprintf(temp, "{\"mcount\":%i,\"timestamp\":\"%s\",\"d\":[", mcount, time);
    //printf("%s", temp);
    strcpy(packet_data,temp);
    strcpy(temp,'\0');
    for(int i = 0; i < MSG_LEN; i++) 
    {
        printf("%04i, ", strlen(packet_data));
        sprintf(temp, "{\"p\":%.3f,\"n\":%i}", float_array[i], i);
        printf("%04i, ", strlen(temp));
        printf("%s, ", temp);
        strcat(packet_data, temp);
        strcpy(temp,'\0');
        if(i<MSG_LEN-1) 
        {
            strcat(packet_data,",");
        }
        printf("%04i, \n", strlen(packet_data));
    }
    strcat(packet_data,"]}");
    return packet_data;                                          //allocated memory is deleted in main.
}

Another option could be to use std::string

#include "mbed.h"
#include "string"

using namespace  std;

int     MSG_LEN;    // message length
int     mcount;     // message number

string  packet_data;

string& make_packet(float* float_array, char* time)
{
    char temp[100];
    packet_data = "{\"mcount\":";
    sprintf(temp, "%i", mcount);
    packet_data += temp;
    packet_data += ",\"timestamp\":";
    sprintf(temp, "%s", time);
    packet_data += temp;
    packet_data += "\",\"d\":[";
    for(int i = 0; i < MSG_LEN; i++) 
    {
        printf("%04i, ", packet_data.length());
        sprintf(temp, "{\"p\":%.3f,\"n\":%i}", float_array[i], i);
        printf("%04i, ", strlen(temp));
        printf("%s, ", temp);
        packet_data += temp;
        if(i < MSG_LEN - 1) 
            packet_data += ",";
        printf("%04i, \n", packet_data.length());
    }
    packet_data += "]}";
    return packet_data; 
}
posted by Zoltan Hudak 05 Jan 2017

What is the output of all your debug statements? Especially what is the lenght of packet_data before/during the error? And is that correct taking the string length of temp into account?

posted by Erik - 05 Jan 2017

Erik, The length of the string changes, it goes from 2125 or something down to 2105 and writes over itself, between the end and the beginning of the for loop the length changes.

Zoltan, I could use a string but as I said memory is limited so I want to use the minimum I can get away with, may try it though.

posted by Daniel Rush 05 Jan 2017

One trick, if you are using sprintf you don't need to then use strcat. Just add the current length of your string on to the the pointer passed to sprintf and it will add the new text on to the end. This avoids the extra buffers and memory copies required when using strcat.

e.g. (not working code but it should give you the idea)

char *stringBuffer[1000];
int strLength = 0;

while (data && (strLength<999)) {
  strLength += sprintf(stringBuffer + strLength  ,"%d",data);
}

The other thing you may want to do is use snprintf to avoid overflow of the string buffer.

#define maxLen 1000

char *stringBuffer[maxLen];
int strLength = 0;

while (data && (strLength<maxLen-1)) {
  strLength += snprintf(stringBuffer + strLength, maxLen-strLength, "%d",data);
}
posted by Andy A 10 Jan 2017

TO WHOM IT MAY CONCERN: The answer to this question is that the library I was using (MQTT) plus the large packet size were causing the heap and stack to run over each other, I believe. The solution to the problem was to break my large packet into two smaller ones and write the function with no dynamic memory allocation and instead of using any concatenation or streams I directly copied each string from one array to the other index, by index.

posted by Daniel Rush 18 Jan 2017

2 Answers

7 years, 3 months ago.

Hello Daniel,
I think there is a memory leak in both samples you provided. You keep creating local new packet_data dynamic arrays (you refer to outside the function) and never delete them.

I forgot to mention that the new character array that is created, is deleted in my main function. As far as I know there is no memory leak and the function runs indefinitely, if there were a leak it would only run 4 times as there is only 40k RAM dynamically available.

posted by Daniel Rush 05 Jan 2017

How do you use the local packet_data pointers in the main (how do you delete them)?
I don't think it's a good idea to return pointers declared and initialized in a function as local and use them in the main. There is no guarantee that the pointer remains valid outside the function's scope (it's rather a coincidence, if it happens).

How does std::string version perform?

posted by Zoltan Hudak 05 Jan 2017

using std::string is going to require a rework of the code as the strcat function requires a char array since it is a C function as opposed to a c++ function. The way I understand it, if you create a pointer to an object in the heap, and pass that pointer back to a main function it should hold it just fine, which it does consistently because heap memory isn't local to functions. I use the standard delete[] packet_data; to delete a char array.

posted by Daniel Rush 05 Jan 2017

Basically you are right. But my concern is about the pointer itself.
I think if you declare the pointer as global, initialize it in the function and then delete the object (created on the heap in the function's body) in the main, it should work.

string* packet_data;    // declare as global pointer (you can use it for deleting the object it point to (for example in the main or other function)

string* make_packet(float* float_array, char* time)
{
    //Create a string *(stream) object to easily format data into JSON format Manually
    stringstream* stream = new stringstream;
    //String to hold and return packet to main
    packet_data = new string;
    //Create start of packet, including message number, and time stamp.
    *(stream) << "{\"mcount\":" << mcount << ","; //begin packet and insert message number.
    *(stream) << "\"timestamp\":\"" << time << "\",\"d\":[";
    //Add all of the data to the packet, append each point to the packet until full desired message length is reached
    //Floats have 3 decimal places
    stream->precision(3);

    for(int i = 0; i<MSG_LEN; i++) {
        //create data points/data numbers
        *(stream) << "{\"p\":" << float_array[i] << ",\"n\":"<< i << "},";
        pc.printf("size of string: %i\n", (stream->str()).size());
    }
    *(stream) << "]}";
    //Turn string stream into a string, finish the packet and return it to main to be sent.
    *(packet_data) = stream->str();
    delete stream;
    pc.printf("packet_data = %s\n", packet_data->c_str());
    return packet_data;
}

int main(void)
{
    ...
    make_packet(..);    // create new packet_data
    ...                 // send packet_data
    delete packet_data;
    ...
}
posted by Zoltan Hudak 05 Jan 2017

This is essentially the same as what I'm doing, but does not address my original question, which was that the for loop would stop concatenating at the same point every time, at around the 93rd or so iteration, the program stops concatenating and overwrites the string. Which is happening in the function, not when it is passed back, all of that has worked fine.

posted by Daniel Rush 05 Jan 2017

Hello Daniel,
I'm sorry but I do not have answer to that question. Maybe someone else will help.
NOTE: On my NUCLEO-F103RB board the for loop throws a runtime error: "Operator new out of memory" after just eight cycles. On the other hand the version with std::string was capable to run hundred iterations with no error.

posted by Zoltan Hudak 06 Jan 2017

I tried using the std::string method and it hangs at runtime no error message just stops running. Not sure what that's about.

posted by Daniel Rush 06 Jan 2017

I'm not sure as well but most likely there is a memory collision. I would recommend you to try the trick proposed by Andy. Seems very efficient.

posted by Zoltan Hudak 11 Jan 2017
7 years, 3 months ago.

I'm not sure if this is related, but i had problems with the LPC4088-QSB board just hanging, when trying to use the Serial class and using device.printf(...) (even very basic stuff and with both USBTX/USBRX and a 'normal' serial port via an external USBtoSerial bridge). The solution in my specific case was to use the RawSerial class instead, which doesn't use Streams (i didn't need them for my application). So maybe the mbed LPC4088 target libraries have a faulty Stream implementation ??

Best regards
Nenad