Another memory fun

Hey again)

Now I have the following code:

; ModuleID = 'sample.lz'
@.str1 = internal global [8 x i8] c" world!\00" ; <[8 x i8]*> [#uses=1]
@.str2 = internal global [8 x i8] c"hello, \00" ; <[8 x i8]*> [#uses=1]
@.str7 = internal global [21 x i8] c"welcome to out hall!\00" ; <[21 x i8]*> [#uses=1]

declare i32 @puts(i8*)

declare i8* @strcat(i8*, i8*)

declare void @llvm.memcpy.i32(i8*, i8*, i32, i32)

define i32 @main() {
mainBlock:
%.str3 = getelementptr [8 x i8]* @.str2, i64 0, i64 0 ; <i8*> [#uses=1]
%.str4 = getelementptr [8 x i8]* @.str1, i64 0, i64 0 ; <i8*> [#uses=1]
%tmp5 = call i8* @strcat( i8* %.str3, i8* %.str4 ) ; <i8*> [#uses=1]
%tmp6 = call i32 @puts( i8* %tmp5 ) ; [#uses=0]
%.str8 = getelementptr [21 x i8]* @.str7, i64 0, i64 0 ; <i8*> [#uses=1]
%tmp9 = call i32 @puts( i8* %.str8 ) ; [#uses=0]
ret i32 0
}

After compiling and running it, I see the following (without the %):

%
hello, world!
world!

What is the problem now?
The following code, however, runs fine:
; ModuleID = 'sample.lz'
@.str1 = internal global [7 x i8] c"father\00" ; <[7 x i8]*> [#uses=1]
@.str2 = internal global [8 x i8] c"mother \00" ; <[8 x i8]*> [#uses=1]

declare i32 @puts(i8*)

declare i8* @strcat(i8*, i8*)

declare void @llvm.memcpy.i32(i8*, i8*, i32, i32)

define i32 @main() {
mainBlock:
%.str3 = getelementptr [8 x i8]* @.str2, i64 0, i64 0 ; <i8*> [#uses=1]
%.str4 = getelementptr [7 x i8]* @.str1, i64 0, i64 0 ; <i8*> [#uses=1]
%tmp5 = call i8* @strcat( i8* %.str3, i8* %.str4 ) ; <i8*> [#uses=1]
%tmp6 = call i32 @puts( i8* %tmp5 ) ; [#uses=0]
ret i32 0
}

After running:
%
mother father

That is fine, but the previous example (where I call strcat more than once) works incorrectly.
Please help me…

Best regards,
Zalunin Pavel

It's invalid for the same reason that

   char *foobar = strcat("foo", "bar");

is invalid in C. Please make sure you understand what you're asking LLVM to do before going any further down this path. A good approach is to write out the correct code in C and then use llvm-gcc (or the demo page

You should look more closely at the man page for strcat(). You have to have sufficient space in the buffer for the first argument. Otherwise you're trashing memory.

Brendan Younger

Hm… I think that is valid in C.

But the following code doesn't work correctly either:
; ModuleID = 'sample.lz'
@.str1 = internal global [6 x i8] c"world\00" ; <[6 x i8]*> [#uses=1]
@.str2 = internal global [7 x i8] c"hello \00" ; <[7 x i8]*> [#uses=1]
@.str7 = internal global [7 x i8] c"father\00" ; <[7 x i8]*> [#uses=1]
@.str8 = internal global [8 x i8] c"mother \00" ; <[8 x i8]*> [#uses=1]

declare i32 @puts(i8*)

declare i8* @strcat(i8*, i8*)

declare i32 @strlen(i8*)

declare void @llvm.memcpy.i32(i8*, i8*, i32, i32)

define i32 @main() {
mainBlock:
%str3 = getelementptr [7 x i8]* @.str2, i64 0, i64 0 ; <i8*> [#uses=2]
%str4 = getelementptr [6 x i8]* @.str1, i64 0, i64 0 ; <i8*> [#uses=1]
call i8* @strcat( i8* %str3, i8* %str4 ) ; <i8*>:0 [#uses=0]
%tmp6 = call i32 @puts( i8* %str3 ) ; [#uses=0]
%str9 = getelementptr [8 x i8]* @.str8, i64 0, i64 0 ; <i8*> [#uses=2]
%str10 = getelementptr [7 x i8]* @.str7, i64 0, i64 0 ; <i8*> [#uses=1]
call i8* @strcat( i8* %str9, i8* %str10 ) ; <i8*>:1 [#uses=0]
%tmp12 = call i32 @puts( i8* %str9 ) ; [#uses=0]
ret i32 0
}

writes:
%
hello world
mother orld

I tried decompiling this code:
int main(int argc, char **argv) {
char str1[] = "mother ";
strcat(str1, "father");
return 0;
}

The decompiler gives me code in which the string "mother \0" is represented as:

%str1 = alloca [8 x i8], align 16 ; <[8 x i8]*> [#uses=9]
%tmp1 = getelementptr [8 x i8]* %str1, i32 0, i32 0 ; <i8*> [#uses=2]
store i8 109, i8* %tmp1, align 16
%tmp4 = getelementptr [8 x i8]* %str1, i32 0, i32 1 ; <i8*> [#uses=1]
store i8 111, i8* %tmp4, align 1
%tmp7 = getelementptr [8 x i8]* %str1, i32 0, i32 2 ; <i8*> [#uses=1]
store i8 116, i8* %tmp7, align 1
%tmp10 = getelementptr [8 x i8]* %str1, i32 0, i32 3 ; <i8*> [#uses=1]
store i8 104, i8* %tmp10, align 1
%tmp13 = getelementptr [8 x i8]* %str1, i32 0, i32 4 ; <i8*> [#uses=1]
store i8 101, i8* %tmp13, align 1
%tmp16 = getelementptr [8 x i8]* %str1, i32 0, i32 5 ; <i8*> [#uses=1]
store i8 114, i8* %tmp16, align 1
%tmp19 = getelementptr [8 x i8]* %str1, i32 0, i32 6 ; <i8*> [#uses=1]
store i8 32, i8* %tmp19, align 1
%tmp22 = getelementptr [8 x i8]* %str1, i32 0, i32 7 ; <i8*> [#uses=1]
store i8 0, i8* %tmp22, align 1

It looks funny. Is there a less complex way to do this?
Thanks

Best regards,
Zalunin Pavel

Zalunin Pavel wrote:

hm.... I think, that is valid in c
[...]
but next code too doesn't works right:
I tried decompile code:
int main(int argc, char **argv) {
  char str1[] = "mother ";
  strcat(str1, "father");
  return 0;
}

Valid C means not only that the code compiles, but also that you are
using library functions properly.
Consult the manpage for strcat(3): "the dest string must have enough
space for the result";
in your example it does not.

Best regards,
--Edwin

Well, this is invalid in C, because you need to allocate enough memory to hold the second string as well. If not enough memory is allocated, strcat writes past the end of the buffer and corrupts whatever follows it. :)

regards
faraz

Zalunin Pavel wrote:

hm.... I think, that is valid in c

[snip]

I tried decompile code:
int main(int argc, char **argv) {
  char str1[] = "mother ";
  strcat(str1, "father");
  return 0;
}

This is valid C, but you forget that str1 is not magically expanded by strcat. It starts out as, and remains, a char array with 8 elements.

decompiler gives to me code, in this code string " mother\0" presents as:

%str1 = alloca [8 x i8], align 16 ; <[8 x i8]*> [#uses=9]
  %tmp1 = getelementptr [8 x i8]* %str1, i32 0, i32 0 ; <i8*> [#uses=2]
  store i8 109, i8* %tmp1, align 16
  %tmp4 = getelementptr [8 x i8]* %str1, i32 0, i32 1 ; <i8*> [#uses=1]
  store i8 111, i8* %tmp4, align 1
  %tmp7 = getelementptr [8 x i8]* %str1, i32 0, i32 2 ; <i8*> [#uses=1]
  store i8 116, i8* %tmp7, align 1
  %tmp10 = getelementptr [8 x i8]* %str1, i32 0, i32 3 ; <i8*> [#uses=1]
  store i8 104, i8* %tmp10, align 1
  %tmp13 = getelementptr [8 x i8]* %str1, i32 0, i32 4 ; <i8*> [#uses=1]
  store i8 101, i8* %tmp13, align 1
  %tmp16 = getelementptr [8 x i8]* %str1, i32 0, i32 5 ; <i8*> [#uses=1]
  store i8 114, i8* %tmp16, align 1
  %tmp19 = getelementptr [8 x i8]* %str1, i32 0, i32 6 ; <i8*> [#uses=1]
  store i8 32, i8* %tmp19, align 1
  %tmp22 = getelementptr [8 x i8]* %str1, i32 0, i32 7 ; <i8*> [#uses=1]
  store i8 0, i8* %tmp22, align 1

it's looks funny, can you say another less complex way to do this operation?
Thanks

Another way:

define i32 @main(i32, i8**) {
entry:
         %argc = alloca i32 ; <i32*> [#uses=1]
         store i32 %0, i32* %argc
         %argv = alloca i8** ; <i8***> [#uses=1]
         store i8** %1, i8*** %argv
         %retval = alloca i32 ; <i32*> [#uses=3]
         store i32 0, i32* %retval
         %str1 = alloca [8 x i8] ; <[8 x i8]*> [#uses=2]
         bitcast [8 x i8]* %str1 to i8* ; <i8*>:2 [#uses=1]
         call void @llvm.memcpy.i32( i8* %2, i8* getelementptr ([8 x i8]* @.str, i32 0, i32 0), i32 ptrtoint (i8* getelementptr (i8* null, i32 1) to i32), i32 0 )
         bitcast [8 x i8]* %str1 to i8* ; <i8*>:3 [#uses=1]
         call i8* @strcat( i8* %3, i8* getelementptr ([7 x i8]* @.str1, i32 0, i32 0) ) ; <i8*>:4 [#uses=0]
         store i32 0, i32* %retval
         br label %return

return: ; preds = %entry
         load i32* %retval ; <i32>:5 [#uses=1]
         ret i32 %5
}

It will still segfault, however. ;)

-Rich

Yes, I agree with you.

But why doesn't this code work:

; ModuleID = 'sample.lz'
@.str1 = internal global [6 x i8] c"world\00" ; <[6 x i8]*> [#uses=1]
@.str2 = internal global [7 x i8] c"hello \00" ; <[7 x i8]*> [#uses=1]
@.str7 = internal global [7 x i8] c"father\00" ; <[7 x i8]*> [#uses=1]
@.str8 = internal global [8 x i8] c"mother \00" ; <[8 x i8]*> [#uses=1]

declare i32 @puts(i8*)

declare i8* @strcat(i8*, i8*)

declare i32 @strlen(i8*)

declare void @llvm.memcpy.i32(i8*, i8*, i32, i32)

define i32 @main() {
mainBlock:
%str3 = getelementptr [7 x i8]* @.str2, i64 0, i64 0 ; <i8*> [#uses=2]
%str4 = getelementptr [6 x i8]* @.str1, i64 0, i64 0 ; <i8*> [#uses=1]
call i8* @strcat( i8* %str3, i8* %str4 ) ; <i8*>:0 [#uses=0]
%tmp6 = call i32 @puts( i8* %str3 ) ; [#uses=0]
%str9 = getelementptr [8 x i8]* @.str8, i64 0, i64 0 ; <i8*> [#uses=2]
%str10 = getelementptr [7 x i8]* @.str7, i64 0, i64 0 ; <i8*> [#uses=1]
call i8* @strcat( i8* %str9, i8* %str10 ) ; <i8*>:1 [#uses=0]
%tmp12 = call i32 @puts( i8* %str9 ) ; [#uses=0]
ret i32 0
}

So, thanks to all…

I now understand my mistake, and I have another question:

How can I generate the following code using the API:

%final = alloca [256 x i8], align 16 ; <[256 x i8]*> [#uses=1]
%final1 = getelementptr [256 x i8]* %final, i32 0, i32 0 ; <i8*> [#uses=2]
call void @llvm.memcpy.i32( i8* %final1, i8* getelementptr ([3 x i8]* @.str, i32 0, i32 0), i32 3, i32 1 )
%tmp5 = call i8* bitcast (i8* (i8*, i8*)* @strcat to i8* (i8* noalias , i8* noalias )*)( i8* %final1 noalias , i8* getelementptr ([4 x i8]* @.str1, i32 0, i32 0) noalias ) ; <i8*> [#uses=0]
ret i32 1

I am interested in the last line:
%tmp5 = call i8* bitcast (i8* (i8*, i8*)* @strcat to i8* (i8* noalias , i8* noalias )*)( i8* %final1 noalias , i8* getelementptr ([4 x i8]* @.str1, i32 0, i32 0) noalias ) ; <i8*> [#uses=0]

Best Regards,
Zalunin Pavel

I know about the BitCastInst class: its first parameter is the strcat function declaration, and the second is a Type*. What type should I use there?

Zalunin Pavel wrote:

but why this code don't work:

; ModuleID = 'sample.lz'
@.str1 = internal global [6 x i8] c"world\00" ; <[6 x i8]*> [#uses=1]
@.str2 = internal global [7 x i8] c"hello \00" ; <[7 x i8]*> [#uses=1]
@.str7 = internal global [7 x i8] c"father\00" ; <[7 x i8]*> [#uses=1]
@.str8 = internal global [8 x i8] c"mother \00" ; <[8 x i8]*> [#uses=1]

declare i32 @puts(i8*)

declare i8* @strcat(i8*, i8*)

declare i32 @strlen(i8*)

declare void @llvm.memcpy.i32(i8*, i8*, i32, i32)

define i32 @main() {
mainBlock:
        %str3 = getelementptr [7 x i8]* @.str2, i64 0, i64 0 ; <i8*> [#uses=2]
        %str4 = getelementptr [6 x i8]* @.str1, i64 0, i64 0 ; <i8*> [#uses=1]
        call i8* @strcat( i8* %str3, i8* %str4 ) ;

Right here you are copying str1 to the memory address following the end of str2. Notice that str3 is a pointer to a 7-char array. It doesn't get bigger. You are doing something that is undefined.

You need:
     char result[100]; // big enough not to overflow.
     strcpy (result, "hello ");
     strcat (result, "world");

-Rich

but why this code don't work:

It does work, but you wrote code that violates the C standard and, therefore, it has undefined behavior--code that compiles and code that actually works are two separate things. For instance, on my machine, it produces code that looks like this in the DATA section:

         .data
_.str1: ; '.str1'
         .asciz "world"

_.str2: ; '.str2'
         .asciz "hello "

_.str7: ; '.str7'
         .asciz "father"

_.str8: ; '.str8'
         .asciz "mother "

With the first strcat, you overwrote the "father" string ("_.str7" in this example) with the "world" string ("_.str1" here). Boom! instant undefined behavior. You're lucky; it could have resulted in reformatting your hard drive. :)

-bw

I'm somewhat new here, but if I'm wrong, hopefully someone will chime in :)

but why this code don't work:

; ModuleID = 'sample.lz'
@.str1 = internal global [6 x i8] c"world\00" ; <[6 x i8]*> [#uses=1]
@.str2 = internal global [7 x i8] c"hello \00" ; <[7 x i8]*> [#uses=1]
@.str7 = internal global [7 x i8] c"father\00" ; <[7 x i8]*> [#uses=1]
@.str8 = internal global [8 x i8] c"mother \00" ; <[8 x i8]*> [#uses=1]

All of the strings here are allocated with exact sizes for their contents...

declare i32 @puts(i8*)

declare i8* @strcat(i8*, i8*)

declare i32 @strlen(i8*)

declare void @llvm.memcpy.i32(i8*, i8*, i32, i32)

define i32 @main() {
mainBlock:
        %str3 = getelementptr [7 x i8]* @.str2, i64 0, i64 0 ; <i8*> [#uses=2]
        %str4 = getelementptr [6 x i8]* @.str1, i64 0, i64 0 ; <i8*> [#uses=1]
        call i8* @strcat( i8* %str3, i8* %str4 ) ; <i8*>:0 [#uses=0]

And here, you're attempting to call strcat on "hello " with "world". strcat does not create a new string; it just writes into the first argument's memory, starting at the first NUL byte it finds. Since the first string only has an allocated size of 7, appending "world" and its terminator (6 bytes) writes past the end of the array and overwrites whatever follows it.

HTH,
Jon