1. Approach
For small embedded applications String processing seems to be unnecessary. The focus is on data processing, numerical values. But for parameterizing, test outputs and other using of strings is necessary.
More rich embedded applications may deal with Strings, for example:
-
A human interface, simple commands, parameter handling. Commands and parameter are strings.
-
Messages, errors, warnings, process messages stored in plain text, maybe outputted via string-oriented interfaces as serial communication (UART) or Ethernet protocolls.
C++ knows a lot of features for String processing, see https://en.wikipedia.org/wiki/C%2B%2B_string_handling, for example also in the QT library https://doc.qt.io/qt-5/qstring.html. But this String operations uses often dynamic memory which is not recommended for long running embedded control and has limits on simple processors.
Hence the situation is often, that the old-style C functions for String processing are used (strlen, strcpy etc. and printf). That is contained in the C and C++ standard, and seems to be the favored or proper possibility. By the way, it is often also a reason to persist with C instead C++.
The standard C string functions have some disadvantages.
Some replacements exists, often compiler- or platform-specific
as for example strncpy_s
from Microsoft’s Visual Studio versus strncpy
for supposed more safety, which is not true in my mind.
Also gcc offers specific, sometimes proper solutions, which are not standard.
Often some String operations especially on small processors needs to much effort.
A printf
with text processing is not simple in machine code.
It should not be used if not necessary.
The emC library offers own solutions for embedded control.
Because this development has also a longer history (since about 1994), it sometimes also has legacy, or also experience.
2. Definition of String capabilities in applstdef_emC.h
For the sources of src_emC
some compiler switches are used which should be defined
in applstdef_emC.h.
This defines can be used in the user’s source too to adapt it to different platforms:
3. Non 0-terminated StringJc
The zero-terminated string is one of the basic ideas of the C language. Given a string argument to a function is very simple. Only the pointer is necessary. The end is found though with the char '0'
if the String is processed. - All other languages uses a pointer to the character array and their length.
C has a known problem with String overflow. One of the reason is: If a length is not known (the '0' should be firstly searched), then it is not able to calculate.
For example it you use the simple function strlen(text)
: If the text
pointer referes (because of an error in a not yet ready software) a faulty location, which can be filled with for example AA AA …
till the end of the memory, then this function searches a long time, maybe wrapped at address 0, maybe have a null-pointer-termination or not, searches furthermore, needs a long time, and the whole system may crash because time overflow in an interrupt.
Hence: The simple 0-terminated Strings are not proper, or unfortunately, or dangerous.
But it is sensible to use the 0-termination inside a given length. For example there is a buffer for a name of a paramter inside a greater data struct
:
char nameParam[32]; float valueParam;
The name can have till 32 character. But it may be shorten internally by a \0
.
4. Supplementation of unsafe basic string functions
See for example msdn…strcpy-is-unsafe-consider-using-strcpys-instead forum or msdn … strncpy_s Microsoft offers in its Visual Studio compiler instead strncpy(…)
a strncpy_s(…)
. - Though strncpy(…)
is safe (in comparision to strcpy, which is really unsafe). There are many hits in internet about that problem.
The argumentation of strncpy_s(…)
: strncpy(…)
does not guarantee a 0-terminated result string, if the src has exact the same length as the result buffer. But that is not a problem of this funtion (https://linux.die.net/man/3/strncpy). but an understanding problem of usage.
Hence, there is a lot of wild growth, emC offers some own functions in its emC/Base/StringBase.h
and …c
:
4.1. strnlen_emC
The strlen(…)
routine of standard C is really unsafe and should never be used. The known strnlen(…)
is safe. See description in https://man7.org/linux/man-pages/man3/strnlen.3.html. But this is for gcc. Visual Studio warns on using of strnlen(…)
. Other platforms may offer other implementations. Because the routine is real simple and can be adapted for special embedded platforms, it is offered as emC
version too. If emC is in use, this function should be used.
The maximal expected length of a string should always be known. For example it is the length of a buffer containing the String. Then the 0-termination is not necessary if the string fills the whole buffer. To get compatibility with a strlen(…)
in legacy sources, the length value can be set to INT_MAX
. Then the behavior is the same. It may be better to use an expected value of a known memory size of a known value about the maximal used length of strings in this application.
4.2. strncmp_emC
The strcmp(…)
routine of standard C is unsafe because the comparison is not terminated. The strncmp(…)
see https://man7.org/linux/man-pages/man3/strncmp.3.html is safe. But this routine has the capability of improvement, return the position of the faulty character. This is often necessary or (in debug situations) nice to know.
int strncmp_emC(char const* const text1, char const* const text2, int maxNrofChars);
returns 0 if both text are equal till a found \0
or till maxnrofChars. It returns <0
or >0
in the same kind as the original strncmp(…)
, but the absolut value is the position of the first difference count from 1. If the first character on text1[0]
is different, it return 1
or -1
. Comparison of "abcde"
and "abcdx"
returns 5
because the 5th character is different. The strncmp(…)
original C routines guarantees only return a result >0
.
4.3. strpncpy_emC
The strcpy(…)
routine of standard C is unsafe and should never be used because on faulty arguments the destination is uncontrolled overridden which can destroy the whole application. A simple example shows it:
char srcBuffer[6]; //declared as not initialized //... and forgotten to write or the expected routine was not called. char dstBuffer[20]; //may be large enough ...? strcpy(dstBuffer, srcBuffer); //seems to be save in the eyes of the developer
This disturbes the whole memory if the srcBuffer
does not contain a '\0'
character and it is not followed by any 0 in the memory. The reason is lightweigth and the result is catastrophical. Do never use the simple strcpy(…)
.
Another example, similar:
char srcBuffer[6] = {0}; //better, it is initialized. strcpy(srcBuffer, "ident"); //correct. Safe. 0-terminated srcBuffer[5] = nr; //user expect "ident5" or such, the 0-termination is missed. char dstBuffer[20]; //may be large enough ...? strcpy(dstBuffer, srcBuffer); //seems to be save in the eyes of the developer
This is also disturbing, because of the missing 0-termination by a maybe thinking error on the srcBuffer
. Do never use the simple strcpy(…)
.
The strncpy(…)
is save, but not in the eyes of all developer, because it is sometimes complicated:
char dstBuffer[20]; //may be large enough ...? but not initialized strncpy(dstBuffer, srcBuffer, sizeof(srcBuffer));
Following the problems of srcBuffer
above, this reduces the number of character copied from srcBuffer to the given length of srcBuffer
, which prevents the error above. But there are two further problems:
-
1) The content in
dstBuffer
may not be 0-terminated. -
2)
strncpy
may be firstly used to prevent overriding of too many memory in the destination, not to determine the number of characters from src, or ? It is not definitely defined.
It seems to be, especially for the effect 1) Microsofts Visual Studio offers an own solution (tested on Visual Studio 2015, on an updated version at 2021-05-23):
char dst[40] = {0}; //initializes the whole content with 0 #ifdef __COMPILER_IS_MSVC__ dst[6] = 'Q'; int okMsc = strncpy_s(dst, "abcde", 5); ...
Because of the known argument meanings of strncpy
this routine might copy the given five characters. But it sets an additionally terminating \0
on the 6th position and overrides the 'Q'
, it copies 6 characters instead 5 as in the originally strncpy(…)
. In Microsoft’s thinking the 0-termination is more important than compatibility with other given standards. The thinking may be, that the length argument is the length of the source. Then Microsoft’s behavior is understandable. But the usual problem is: Overriding the destination should be prevented. Hence the length argument should be follow the situation on destination, more exact it should be regard both, regarding an exact given number of characters for not 0-terminated source strings, and the size of the destination buffer. The behavior of Microsoft’s solution is wrong in respect to the destination length.
The offered emC solution: strpncpy_emC(…):
The Version of strpncpy_emC
regards all necessities to secure copy and concatenate strings with or without zero termination.
See in emC/Base/StringBase_emC.h
:
int strpncpy_emC(char* dst, int posDst, int zDst, char const* src, int zSrc);
dst
is the destination buffer, which will be written from posDst
. zDst
is the length which can be filled with the characters. It determines the used and written memory. src
is the source string with either the given length in zSrc
or till a found '\0'
character. Especially if zSrc = -1
or <0
the zSrc
is used till 0-termination. But never the dst
overflows because it is determined by zDst
. The return value is the number of copied character without the maybe copied 0 as termination. It can be simple used to add to posDst
for appending. Last not least this operation does not spend unnecessary calculation time to fill the destination with 0 till end. Usual it should be filled with an operation before, respectively one 0-terminating '\0'
character is enough.
If there is a difference in arguments, especially a given src
string cannot be copied as a whole to the given dst
because the number of possible character to copy (zdst - posDst)
is too less, then it can be seen as error (the given string cannot be copied) or as expected behavior (the number of copied strings is lesser because size of destination). If it is seen as error, then either an exception should be thrown, or any simple testable output should be given. But it is defined as behavior. The resulting string can be simple checked by debugging and the result is obviously. Hence no exception is given. It may be possible to compare the return value with a known length of source to detect a truncation:
char dstBuffer[40] = {0}; int zDst = sizeof(dstBuffer); //it is 40 int nrofCharsCopied = strpncpy_emC(dstBuffer, 0, zDst, src, -1); if(src[nrofCharsCopied] !=0) { ....
Here it is expected and tested that src
is copied always till its 0-termination.
The operation is save, never an undesired memory location will be overridden. See the example for concatenation:
char dstBuffer[40] = {0}; int zDst = sizeof(dstBuffer); //it is 40 int posDst = strpncpy_emC(dstBuffer, 0, zDst, "first string", -1); if(posDst < zDst) { dstBuffer[posDst++] = nr; } //write a number as char between posDst += strpncpy_emC(dstBuffer, posDst, zDst, " ", 2); //typical append //append a numeric value posDst += toString_int32_emC(dstBuffer + posDst, zDst - posDst, nr2, -1); posDst += strpncpy_emC(dstBuffer, posDst, zDst, " kg\n", -1); //typical append
This is pure C. Using C++ with overridden operators is more simple to read, but it bases on the same routines. As you see the argument posDst
makes it easier for concatenation. The called routine toString_int32_emC(…)
does not support that handling of posDst
, instead a pointer add and a subtract is need on call, which is equivalent on execution but a little bit more complex in the source.
4.4. strnchr_emC, searchChar[Back]_emC
Adequate remarks as for strcmp
etc. are valid for strchr
. This function is not save, because it may crash if the source string is not found as 0-terminated by any mistake. The original routine in emC is:
int searchChar_emC ( char const* text, int zText, char cc);
The length of the text is terminated. A 0-termination is regarded if the 0 is found, but not necessary. The return value is the position of the found character, or -1 if not found. It follows known and proven Java concepts.
char const* strnchr_emC ( char const* text, int cc, int maxNrofChars)
is also supported. The difference is the return value only. It is compatible with the knwon strnchr(…)
of standard-C.
searchCharBack_emC ( char const* const text, char cc, int fromIx);
is the adequate solution for searching the last occurence of cc
in the given text
.
4.5. searchString[Back]_emC
This is the replacement of the strstr(…)
function, but with more simple usage following the Java approach in java.lang.String.indexOf(String)
.
int searchString_emC ( char const* text, int maxNrofChars, char const* search, int zs)
As the other replacements a 0-termination is not need. search
is terminated either by a '\0'
or by the given length zs
. The text where the searching occurs is also determined either by a '\0'
or by the given length maxNrofChars
. The return value is either the found position or -1 if not found.
int searchStringBack_emC ( char const* text, int maxNrofChars, char const* search, int zs)
is the adquate function searching the last occurence of the given String. It is adequate the Java core function java.lang.String.lastIndexOf(String)
.
4.6. searchAnyChar[Back]_emC
This is the replacement of the strspn(…)
function, but with more simple usage.
int searchAnyChar_emC ( char const* text, int maxNrofChars, char const* chars, int zs)
As the other replacements a 0-termination is not need. search
is terminated either by a '\0'
or by the given length zs
. The text where the searching occurs is also determined either by a '\0'
or by the given length maxNrofChars
. The return value is either the found position or -1 if not found.
int searchAnyCharBack_emC ( char const* text, int maxNrofChars, char const* chars, int zs)
is the adquate function searching the last occurrence of one char of the chars
in the given String.
4.7. Necessity of printf? instead toString_int32/float_emC
The printf
is a core of C usage, see all "Hello world" examples. But printf
is a really complex operation. If it is used, it increases the amount of memory, especially for poor embedded controller with small capabilities.
What is really necessary? Sometimes numeric values should be present as string also in poor controllers for example to output values in a UART communication. The whole capability of the C standard printf
is often too much.
The simple routine
int toString_int32_emC(char* buffer, int zBuffer, int32 value , int radix, int minNrofCharsAndNegative);
offers a conversion from numeric value to string in decimal or hexadecimal presentation. It prevents usage of division because the division may be a more expensive operation in calculation time on some processors.
This solution is enqueued in the solution of string concatenations. It prevents the necessity of including library functions for printf
which are often more extensive.
4.8. Scanning of numeric values: parse[Int|Float]Radix_emC
This is the counterpart of scanf
approaches, more simple as the original and standard scanf(…)
.
int parseIntRadix_emC ( const char* src, int zSrc, int radix, int* parsedChars , char const* addChars);
It returns the number. This function is a jack of all, respectively the necessities, though it is short in code (200 Bytes on a TI320 controller) and short in execution. addChars
controls the abilities:
-
The
parsedChars
is the second output of the function via the given integer pointer for the parsed number of characters for the digit. It is usual necessary in the parsing process, but can be omitted too if not necessary, by given anull
pointer. -
If
radix = 10
it parses decimal,radix=16
or=8
hexa or octal, or all other radix. An abbreviated radix, not 10 or 16 may be unnecessary, but the number is a simple number for calculation and comparison. -
The hexa digits are
'a'..'f' or ’A'..'F'
, and tillz
for the radix 26 (if desired). -
If
addChars =null
or empty, it parses only the digits. -
A space as first char in
addChars =" "
forces skipping over leading spaces and tabs, a"\n"
on first position forces skipping over all leading white spaces. -
A minus
addChars ="-"
on the first position or on second position afteraddChars =" -"
or="\n-" accepts a ’-'
for a negative number, a plus on that positionsaddChars =" +"
accepts also a'+'
character (positive value) which is generally unnecessary but it is hence possible. -
Space or newline after the minus or plus
addChars ="- "
accepts spaces or tabs after the sign till start of the number, same foraddChars ="+ "
. The combinationaddChars ="\n+ "
accepts white spaces before the number and spaces or tabs between sign and number. -
A
'x'
after this designations or on startaddChars ="x"
accepts switching to hexa numbers on a given0x
designation in the number. It meansaddChars ="\n+ x"
accepts whithe spaces before sign, the sign, spaces and tabs after the sign, and then a0x
to detect a hexa number. -
All other characters in
addChars =" '| ,"
are characters which can be placed between the digits. Especially the space to group digits in one number may be sensitive for some applications. A space to separate should not written on the first positions see above, but on following positions:-
addChars =" "
accepts leading spaces and tabs, and then spaces (not tabs) between the digits, two spaces are given. -
addChars ="1 "
does not accept leading spaces (the space is not left) but spaces as separator inside the digit. The "1" has no meaning, it cannot be a separator, it is a digit. It is a little helper to have the space not left. -
addChars =" x "
accepts for example ` 0x 34 56 ca FE` as hexa number. -
addChars =" x"
accepts only ` 0x3456caFE`, not with space between the digits, but -
addChars =" x'"
accepts also ` 0x3456’caFE` for better readablity.
-
There may be some combinations which are not sensitive, but formally possible. The operation is not intend as a sharp parser, it is intend to read numbers in a more free style, for example in a parameter list:
param1 = + 456.45 param2 = 0x100 key = 234 356 357 key2 = 0xcafe'affe
The numbers are parsed in any case after the '='
, all this forms are accepted with addChars =" + x '"
. The white space form with "\n"
is not necessary in most cases but possible if a strong whitespace thinking is given. The floating number is parsed with
float parseFloat_emC ( const char* src, int size, int* parsedChars);
which uses the parseIntRadix_emC(…)
for the integer and fractional part and for the exponent.
5. StringJc as unique string representation with capabilities
The struct StringJc
in emC presents a String with its length maybe in several forms
(depending of the compiler switches 'DEF_NO_StringUSAGE' and 'DEF_CharSeqJcCapabilities',
see #applstdef_emC_h.html#DEF_StringCapabilities.
5.1. StringJc definition itself
It is a data struct
consisting of two values, a pointer to the string itself
and an integer value containing the length and some status bits:
One of the basic ideas in the development was: It should be able to return by value or also used for call by value:
StringJc myStringOperation ( StringJc inp ) { StringJc ret = ... //build the String maybe from inp return ret; }
See also types_def_common_h.html#AddrVal_emC.
This approaches are used also for the type StringJc. But depending on more capabilities the address is a union:
It is a little bit complicated because conditional compiling, but it is consequently:
-
If
DEF_NO_StringUSAGE
is defined, a StringJc can only be a simpleconst
character sequence. It can be 0-terminated or not. Often it is 0-terminated, but the length is given in theStringJc
instance. -
If
DEF_NO_StringUSAGE
is not defined, the String can also be stored in aStringBuilderJc_s
instance. This is defined by status bits inval
, it has a constant value of0x3ffe
, see also next chapter formLength_StringJc
. -
If
DEF_CharSeqJcCapabilities
is defined, a String can be gotten from anObjectJc
instance which implements the CharSeq interface. This is similar in Java. See chapter Vtbl_CharSeqJc: CharSequence in C language. -
If C++ usage is active, the CharSeq can also be implement by virtual operations of the named class.
It means, a StringJc
can be:
-
a simple C-String, maybe 0-terminated or not, as const string, unmutual as in Java.
-
a reference to a buffer to prepare a String, see chapter StringBuilderJc: Buffer to prepare Strings. Then it is a mutual String.
-
a reference to any Object, which has an operations due to
Vtbl_CharSeqJc
, see #Vtbl_CharSeqJc -
a reference to any C++ instance of type
CharSequenceJcpp
which is an interface to the C++ CharSequence with virtual operations.
This offers some capabilities of String processing
which are more affine to Java language then to C/++ standards.
Especially the simple form only using the str
element of this union is very simple
also proper for small, poor processors.
It is the pointer to a string, not necessary zero-terminated,
and the length is given in val.
5.2. Meaning of the val, mLength_StringJc
The val
of this struct contains not only the length of the String but also some control bits. For simple usage 16 bit are sufficient, for more capability 32 bit for this val
value are necessary:
The basic definition to evaluate this val
is
//in applstdef_emC.h or in compl_adaption.h only if necessary: #define mLength_StringJc 0x00003fff
This is the standard definition, also established in emC/Base/StringBase_emC.h
;
#ifndef mLength_StringJc #define mLength_StringJc 0x00003fff #endif
Depending on this value some other definitions are contains in emC/Base/StringBase_emC.h
:
#ifndef DEF_CharSeqJcCapabilities #define mVtbl_CharSeqJc 0 #define kIsCharSeqJc_CharSeqJc 0 #define kMaxNrofChars_StringJc (mLength_StringJc -2) #define mIsCharSeqJcVtbl_CharSeqJc 0 #else #define mVtbl_CharSeqJc (mLength_StringJc >>2) #define kIsCharSeqJc_CharSeqJc (mLength_StringJc -2) #define kMaxNrofChars_StringJc ((mLength_StringJc & ~mVtbl_CharSeqJc)-1) #define mIsCharSeqJcVtbl_CharSeqJc (mLength_StringJc & ~mVtbl_CharSeqJc) #endif #define kIs_0_terminated_StringJc (mLength_StringJc) #define kIsStringBuilder_CharSeqJc (mLength_StringJc -1) #define mNonPersists__StringJc (mLength_StringJc +1) #define mThreadContext__StringJc ((mNonPersists__StringJc)<<1)
This definitions based on mLength_StringJc == 0x3fff
defines the following values:
Bits:
-
0x8000: mThreadContext__StringJc
: This bit describes the location of a StringJc inside the ThreadContext. It is allocated thread local and should be deallocated after usage. -
0x4000: mNonPersists__StringJc
: If this bit is set, the string may be changed. It is not an unmutual String.
Value mask with mLength_StringJc == 0x3fff
:
-
0x3fff: kIs_0_terminated_StringJc
: This value means, the length of the String is not known yet, but the string is zero terminated. The length can be determined by searching the '0' character. This definition makes it easy to define aconst StringJc
from a literal with initializer list:#define INIZ_z_StringJc(TEXT) { TEXT, kIs_0_terminated_StringJc}
-
0x3ffe: kIsStringBuilder_CharSeqJc
: A reference to aStringBuilderJc
is used, the string is in the buffer of theStringBuilderJc
. The length is determined by theStringBuilderJc
data.
If DEF_CharSeqJcCapabilities
is not set, then it is more simple.
-
0x3ffd: kMaxNrofChars_StringJc
: This is the maximal value for a length of a String. If the value maskmLength_StringJc == 0x3fff
with is0 .. kMaxNrofChars_StringJc
then it is an unmutual or mutualchar const*
string with this given length.
With this bits designation a StringJc
reference can present all of this named strings.
The simple case is always possible, the unmutual char const*
simple String.
If DEF_CharSeqJcCapabilities
is given, then the StringJc can refer to an ObjectJc instance which implements the CharSeq function pointer table. The ObjectJc*
pointer part of the union is used. The ObjectJc
instance should offer 3 operations to get the length, any index chars and a sub sequence likewise as java.lang.CharSequence
interface. But the StringJc can also be a simple unmutual char const*
or a StringBuilderJc
instance, of course.
-
3ffd: kIsCharSeqJc_CharSeqJc
If this value is set masked withmLength_StringJc
, then the reference refers to anObjectJc
instance which should contain a function pointer table toCharSeqJc
routines. The function pointer table is gotten from the instance by callinggetVtbl_ObjectJc(obj, sign_Vtbl_CharSeqJc)
. This searches the function pointer table inside theClassJc
type information. -
0x3000: mIsCharSeqJcVtbl_CharSeqJc
; If the val mask with this bits is set, then theval & mLength_StringJc
is in range0x3000..0x3ffc
. Theval && mVtbl_CharSeqJc
is the index to a function pointer table which is used to implement dynamic call on runtime in C language or in C++ without using virtual. The advantage for such aStringJc
is: The index is already built (elsewhereint getPosInVtbl_ObjectJc(othiz, sign_Vtbl_CharSeqJc);
should be called which needs a little effort. This variant should only be used for local values (hold in stack) which are more safe then anywhere in data memory. Elsewhere there can be the same problems as using the virtual mechanism in C++: Disturbed data can force a crash of programm execution. -
0x0fff: mVtbl_CharSeqJc
This is the mask for theposInVtbl
for a CharSeqJc Object. See examples and chapter…TODO -
0x2fff: kMaxNrofChars_StringJc: 0x2fff
If the(val & mLength_StringJc) ⇐ kMaxNrofChars_StringJc
then it is achar const*
unmutable String with this given length.
This definitions should not be seen as too complicated and sophisticated. Really it is simple:
-
Often the simple form of
StringJc
is used only (const char*
). -
If necessary the String can be more complex.
-
Some access operations (next chapter) know this bits and use it.
5.3. Build operations for StringJc in in emC/Base/StringBase_emC.h
This operations can be used though DEF_NO_StringUSAGE
is set.
Some operations are already contained in types_def_common.h
,
hence they are defined without including emC/Base/StringBase_emC.h
.
This definitions are used also in other basic features, especially exception handling:
The application of the INIZ
macros is simple.
They are macros because the definition should be done already on const definition level
of the compiler.
Note: We have not or won’t be use nice automation as given in C++ with initialization code
which runs before main
or / and usage of heap memory.
StringJc myConstString1 = INIZ_text_StringJc("content of this string1"};
This above defines two constant words in the const memory area with the reference to the constant string literal and the length of the string literal. It is a const memory area definition. C++ experts for PC programming may not know what is meant, but embedded C developers knows it.
Do not write:
StringJc myConstString2 = INIZ_text_StringJc(refToAString);
Unfortunately the C and also C++ compiler don’t distinguish between a String literal
which is formally a char const*
, a pointer, and the really pointer to any other String.
The length is faulty calculated with the sizeof(refToAString)
, here it’s the
number of memory words of the pointer. The length of the referenced String cannot be calculated.
The macro
StringJc myString3 = INIZ_StringJc(refToAString, itsLength;
can be used especially if a StringJc
instance should be created and initialized
in the stack, and the pointer and the length to any other given character array
is known in variables.
5.4. Access functions in emC/Base/StringBase_emC.h
This access functions can be used also if DEF_NO_StringUSAGE
is set.
Adequate the possibilities are reduced, but the operations are offered.
5.4.1. getCharsAndLength_StringJc
This operations can be used though DEF_NO_StringUSAGE
is set.
=>source: src_emC/emC/Base/StringBase_emC.h[tag=getCharsAndLength_StringJc] /* Gets the char-pointer and the number of chars stored in a StringJc. * This is a common way to get a the content of a StringJc-Instance if a char-pointer is necessary. * The char-pointer is either the immediately reference as stored in the StringJc, * or it is a pointer to the content of a referenced StringBuilderJc. * The referenced character array may be persistent or not. * In generally, the user should not store and use the return value persistently outside of the StringJc-concept. * The routine is given to support evaluating the chars immediately in a C-like-way. * @param length destination variable (reference) to store the realy length. * The destination variable will be set anytime. * If it is -1 then the StringJc is a CharSequence interface. * @return the pointer to the chars. It may be null if the StringJc contains a null-reference. * Or it is a CharSequence interface */ extern_C char const* getCharsAndLength_StringJc ( StringJc const* thiz, int* length);
offers a pointer to the char sequence and the length information for both case,
an immediately referenced character sequence or a character sequence in a StringBuilderJc
.
For usage it is recommended to use this reference immediately for example to copy the content or as argument for a processing function which does not store the reference.
Storing the reference is a prone of error because it is not asserted that the character sequence
is persistent. You should use the StringJc
instance itself to use as reference.
See also