Proposed C++ optimization with big speed gains with big objects

Hi guys

I want to discuss a proposed optimization step (an optional switch?) for the
C++ compiler that would make old and new code in some cases an order of
magnitude faster by pure recompilation alone.

...After all those attempts to store anything via = without waste, I
think that the compiler should do "placement new" transparently for us
whenever it encounters = on a movable object, instead of forcing us to
implement a zillion && operators that solve only the heap side of the
problem. Deciding which object to use as "this" is easy, since the static
analysis that decides which objects are movable is already part of new
compilers as part of their support for &&.

Skyscrapper city[1000]; // instead of temp
// compiler should use &city[1] as this pointer
city[1] = Skyscrapper("Empire State Building");

This would fix both heap and static waste: zero allocations, zero copies.
Why is static memory waste equally if not more important? The majority of
objects are small, so the majority of their memory is static.

Benchmark results in article www codeproject
com/Articles/453022/The-new-Cplusplus-11-rvalue-reference-and-why-you

I am kind of lost in the clang code, but maybe one of you guys can
throw together this automatic placement new a lot faster, since you know
your code.
I just can't wait to see the numbers gained from untouched C++ code,
like running the benchmark code from my article with such compiler
changes.

I love what you guys do. Finally a better alternative to the gcc monster.
Keep up the great open source work.

Best Regards Ladislav [Unemployed]
neuralll[@]gmail[.]com

AMDG

Skyscrapper city[1000]; // instead of temp
// compiler should use &city[1] as this pointer
city[1] = Skyscrapper("Empire State Building");

This is seriously problematic in most cases.
The old state of city[1] may need to be
cleaned up. Thus, you would also have
to call city[1].~Skyscrapper() first.
Then, if Skyscrapper::Skyscrapper(const char *)
throws, the state of city[1] is undefined.
There's no way to invoke the destructor
/after/ calling Skyscrapper, since that
would leave two objects at the same address
while the constructor is running. Basically,
there's no way to make this work so that
all resources are allocated and freed
correctly. This isn't even taking into
account the fact that C++ specifies exactly
how this code is supposed to behave, and
skipping the assignment operator is not licit.

In Christ,
Steven Watanabe

...The old state of city[1] may need to be
cleaned up. Thus, you would also have
to call city[1].~Skyscrapper(), first.

I see this more as an interesting new feature. The ability to recreate an
object just by reinvoking the constructor with new parameters is a welcome
addition, at least for me.

...if Skyscrapper::Skyscrapper(const char *)
throws, the state of city[1] is undefined.

The behavior is already defined as part of the C++ ISO standard in the
"placement new" section.
The moment the user decided to invoke the constructor, he agreed to lose the
old object, so I don't see a problem there.

...There's no way to invoke the destructor
/after/ calling Skyscrapper, since that
would leave two objects at the same address
while the constructor is running.

Yes there is, exactly as in the placement new case, since the destructor
handles only dynamic members, not the object itself, which is statically
preallocated and can't be deallocated alone.

The destructor situation is exactly the same as if some constructor throws
during any static array creation.
Only dynamic memory members can be undefined. And this situation is no
different from today's implementation of static arrays, so there is no
different behavior there either.

I think that by using the same code that placement new generates, the
behavior will be exactly what the C++ ISO standard defines for placement new.

I wouldn't worry about the copy (=&) or move (=&&) operator not being called
in this case of array element creation, since that was the whole purpose of
the optimization.
That is, creating in the array as it should have been in C++ from the start,
instead of the usual inefficient create temp and copy (=&), or create temp
and half copy (static mem) and half move (dynamic mem) in (=&&).

Using such an optimization switch would be popular, since you just
recompile any old code without any change and get a performance boost that
you can't get by writing a zillion duplicate && operator variants, since
they solve only the heap.

Thanks for the very insightful look at the issues. If only I knew where to
start, and where assignment to an array element or placement new is
implemented in clang.

...The old state of city[1] may need to be
cleaned up. Thus, you would also have
to call city[1].~Skyscrapper(), first.

I see this more as an interesting new feature. The ability to recreate an
object just by reinvoking the constructor with new parameters is a welcome
addition, at least for me.

An "interesting new feature" is one thing, doing it behind the scenes as an optimization is completely different. (Not that I agree with this being interesting.)

...if Skyscrapper::Skyscrapper(const char *)
throws, the state of city[1] is undefined.

The behavior is already defined as part of the C++ ISO standard in the
"placement new" section.
The moment the user decided to invoke the constructor, he agreed to lose the
old object, so I don't see a problem there.

The user did no such thing. The user decided to create a temporary and then move-assign it to the old position. You want the compiler to decide for him that instead there will be a placement new without calling the destructor.
Which, by the way, is in no way defined by the C++ standard. You're overwriting an object without properly ending its life. You're misusing an object as raw memory. You're invoking undefined behavior. End of story.

If you want to do that in your code, that's your business, and C++ won't stop you from explicitly using placement new. Don't ask the compiler to do it for you, because that is just insanely dangerous.

...There's no way to invoke the destructor
/after/ calling Skyscrapper, since that
would leave two objects at the same address
while the constructor is running.

Yes there is, exactly as in the placement new case, since the destructor
handles only dynamic members, not the object itself, which is statically
preallocated and can't be deallocated alone.

The destructor is there to perform cleanup of any kind. Maybe unregister the object in some global map. You don't call the destructor, you've got a problem.
Placement new requires raw memory.
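
To make the cleanup point concrete, here is a minimal sketch. It assumes a hypothetical Tracked type (not from this thread) whose constructor registers the object in a global set and whose destructor is the only thing that removes the entry. It only shows standard behavior: assignment leaves the registration alone, and the entry disappears when the destructor runs.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical example type: every live Tracked object is listed in a
// global registry, and only the destructor removes the entry.
class Tracked {
public:
    Tracked() { registry().insert(this); }
    Tracked(const Tracked&) { registry().insert(this); }
    // Assignment copies no state here and leaves the registration untouched.
    Tracked& operator=(const Tracked&) { return *this; }
    ~Tracked() { registry().erase(this); }

    static std::set<const Tracked*>& registry() {
        static std::set<const Tracked*> live;
        return live;
    }
};

// Registered objects while a and b are alive and a has been assigned to.
std::size_t live_count_during_scope() {
    Tracked a;
    Tracked b;
    a = b;  // operator= only; no destructor runs on a
    return Tracked::registry().size();
}

// Registered objects after a, b and a temporary have all been destroyed.
std::size_t live_count_after_scope() {
    {
        Tracked a;
        Tracked b;
        a = Tracked();  // the temporary unregisters itself at the end of this statement
    }
    return Tracked::registry().size();
}
```

If the destructor were skipped for any of these objects, its entry would stay in the registry and point at storage that no longer holds that object.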

The destructor situation is exactly the same as if some constructor throws
during any static array creation.

No.

Only dynamic memory members can be undefined.

That doesn't even mean anything.

  And this situation is no
different from today's implementation of static arrays, so there is no
different behavior there either.

Wrong.

I think that by using the same code that placement new generates, the
behavior will be exactly what the C++ ISO standard defines for placement new.

Yes, indeed. It will be just as undefined as if you placement new over an existing object.
If you want to do that, use placement new. Don't expect perfectly valid code to be transformed to invalid code.

I wouldn't worry about the copy (=&) or move (=&&) operator not being called
in this case of array element creation, since that was the whole purpose of
the optimization.

An optimization is a code transformation that preserves behavior. The cases where we could prove that your transformation preserves behavior are those that are so trivial that inlining, value propagation and dead store elimination will have exactly the same effect anyway.

That is, creating in the array as it should have been in C++ from the start,
instead of the usual inefficient create temp and copy (=&), or create temp
and half copy (static mem) and half move (dynamic mem) in (=&&).

C++ has lifetime rules for objects. Those are there for a reason.

Using such an optimization switch would be popular, since you just
recompile any old code without any change and get a performance boost that
you can't get by writing a zillion duplicate && operator variants, since
they solve only the heap.

And as a side effect, you get subtle bugs. Yay!

Sebastian

The user did no such thing. The user decided to create a temporary and
then move-assign it to the old position.

No, the user said: let there be 1000 static objects,
and let the first one be initialized with those parameters.
The fact that nothing in the code says how this goal is reached is a good
thing, since it allows us to do it in an efficient way.

You want the compiler to decide
for him that instead there will be a placement new without calling the
destructor.

No, I want this destructor to be called and to behave exactly like the
placement new part of the standard, i.e. deleting only members.

Which, by the way, is in no way defined by the C++ standard.

One optimization switch speeding up all your old code without any change
will be well worth it for many, since it will produce faster and smaller code
than a zillion heap-only && operators copy-pasted everywhere.

You're
overwriting an object without properly ending its life. You're misusing
an object as raw memory.
You're invoking undefined behavior. End of story.

No, the behavior and memory usage of the basic object, members, destructors
etc. is exactly as defined in the standard for placement new. So the whole
behavior is exactly the same, since a copy of the placement new code is used
to implement it.

If you want to do that in your code, that's your business, and C++ won't
stop you from explicitly using placement new. Don't ask the compiler to
do it for you, because that is just insanely dangerous.

I don't see any more danger there than with placement new, which is
already part of clang and the standard.

And as a side effect, you get subtle bugs. Yay!

Can you give an example?

But I completely understand your cautious approach. And I understand that
any change must be thoroughly tested; that's why I actually want to implement
it and run it through the tests that are part of clang.
If only for fun :).

Can anyone who is deeper into clang please point me in the right direction
on how I should go about implementing it in clang?
A pointer in the right direction would save me a lot of time.
Which method in the clang sources processes the assignment in city[1] =
Skyscrapper("building"), and which one processes placement new?
I am lost in the sources due to the generic names. Thanks for any tip. Some
gcc guys wanted me to do a gcc C plugin, but compiling gcc is a nightmare, so
I prefer testing in clang for now.

He already did. The problem is that your optimization changes what the
standard says will happen. Under normal circumstances,

Obj arr[10];
arr[0] = Obj();

will be equivalent to this:

Obj o;
arr[0] = std::move(o);

So if Obj's constructor fails, arr[0] is intact. Additionally,
operator= gets to handle any exception safety requirements. With your
optimization, this changes into

arr[0].~Obj();
new (&arr[0]) Obj();

So you see, exception safety can't be had anymore -- if the Obj()
constructor fails, you've got a destructed object and no way to
recover it.
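
The standard behavior described above can be checked with a small sketch. The Obj type below is a hypothetical stand-in with a constructor that can be told to throw, and try_assign is an illustrative helper name, not anything from the thread. Because the temporary is fully constructed before operator= runs, a failed construction leaves the target untouched.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the thread's Obj: holds one int and can be
// asked to throw from its constructor.
struct Obj {
    int value;
    Obj() : value(0) {}
    explicit Obj(int v, bool fail = false) : value(v) {
        if (fail) throw std::runtime_error("construction failed");
    }
};

// Standard semantics: the temporary is constructed first, and operator=
// runs only if that construction succeeded.
bool try_assign(Obj& target, int v, bool fail) {
    try {
        target = Obj(v, fail);
        return true;
    } catch (const std::runtime_error&) {
        return false;  // target was never touched
    }
}

// Returns the value left in arr[0] after a failed assignment attempt.
int value_after_failed_assign() {
    Obj arr[10];
    arr[0] = Obj(7);
    bool ok = try_assign(arr[0], 99, /*fail=*/true);
    assert(!ok);
    return arr[0].value;
}
```

With the destroy-then-construct rewrite, the same failure would happen after arr[0] had already been destroyed, which is the case the reply above objects to.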

int a=1;
a=2;

The moment I agreed to assign a new value, I agreed to the old value being
lost. This is normal and expected behavior.

If you want to preserve the old value for whatever reason (construction
failure), you keep the old value in a temp like you do for everything else.

The advantage is that you know what is going on.
You know that a redundant copy is being kept and you can use it as you wish.
But there is a big performance cost in preserving the state of objects that,
99% of the time, we don't care about in this particular case.

// 1000 default constructors, useless in this case: they do nothing that
// the specialized constructor does not do too
Skyscrapper city[1000], tmp = city[1];

// if an exception occurs, we just try again or restore from tmp in the
// usual exception handler
city[1] = Skyscrapper("Empire");

In this case, which covers pretty much most static array creation loops,
restoring from tmp doesn't make sense, so thank god we can skip it and get
rid of the useless copy.

The problem you face is that all the code in the world is written
against the standard, not against your scary optimization.

That means that if you have your "optimization" enabled in a TU, _all_
the code that is pulled into it must be aware of the standard-violation
that your optimization performs.

The core concern you have when implementing an optimization is not
optimizing the cases you care about. It's making sure it doesn't screw
over all the other cases.

It's completely irrelevant if it may make some explicitly desired cases
better, if it breaks the rest of the world indiscriminately.

I'm willing to bet three pinecones that if you implemented this, large
chunks of code would start failing, including the standard libraries
used.

The idea was to enable it just for the first initialization, i.e. we mark
array elements whose last constructor was the default one.
Knowing this, we can return the object to its previous state, as in the
previous copy case.

Skyscrapper city[1000];
city[1] = Skyscrapper("Empire State Building");

// release member resources acquired by the default constructor
city[1].~Skyscrapper();
try {
    new (&city[1]) Skyscrapper("Empire State Building");
} catch (...) { // cleanup
    // reacquire member resources acquired by the default constructor
    new (&city[1]) Skyscrapper();
    throw;
}

This will make the code behave as in the standard, i.e.
in the exception handler you get the object in its state after default
initialization, while still preventing the default copy.
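
The transformation above can be written out by hand as a sketch, to see what it does and does not preserve. rebuild_in_place is a hypothetical helper name and the Skyscraper type is a stand-in with one string member. Nothing here is an existing compiler feature or library call, and the fallback is only safe under the assumption that the default constructor does not throw.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper: destroys *slot and rebuilds it in place from args.
// If that construction throws, a default-constructed object is put back
// and the exception is rethrown. If the default constructor threw as
// well, *slot would be left holding no object at all.
template <class T, class... Args>
void rebuild_in_place(T* slot, Args&&... args) {
    slot->~T();  // end the old object's lifetime
    try {
        ::new (static_cast<void*>(slot)) T(std::forward<Args>(args)...);
    } catch (...) {
        ::new (static_cast<void*>(slot)) T();
        throw;
    }
}

// Stand-in type: construction fails for an empty name.
struct Skyscraper {
    std::string name;
    Skyscraper() : name("unnamed") {}
    explicit Skyscraper(const std::string& n) : name(n) {
        if (n.empty()) throw std::invalid_argument("empty name");
    }
};

// Name held by city[1] after a rebuild attempt that throws.
std::string name_after_failed_rebuild() {
    Skyscraper city[3];
    city[1] = Skyscraper("Old Tower");
    try {
        rebuild_in_place(&city[1], std::string());
    } catch (const std::invalid_argument&) {
        // swallowed so that the remaining state can be inspected
    }
    return city[1].name;
}
```

Note what the failed rebuild leaves behind: city[1] ends up default-constructed, not holding "Old Tower", whereas a plain city[1] = Skyscraper(...) would have left the old value in place. The sketch matches standard behavior only when the element was still in its default state beforehand.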

Yes, the state machine will change, i.e. if you were doing, let's say, gets
(from a file) in the default constructor, you will get a change in code
logic, since the default constructor is invoked one more time.
But this is bad design.

I think the first line of the documentation of such a switch should be:
"by using this switch, the default constructor will be called again during an
exception to restore the object to its previous state"

Now, how relevant is this previous object state when exceptions start
flying around from constructors?

Statistically, 99% of exceptions in constructors happen due to memory
exhaustion (we are already doomed).
Statistically, 99% of member pointers in the code around you are not smart
pointers (we are already doomed).

So I think it should be tested, since the advantages massively outweigh the
pretty much hypothetical disadvantages.

Me too, but when I write

arr[0] = Obj();

I expect the assignment to be performed after the constructor invocation, so if it failed, I expect no assignment, so nothing lost.

-- Jean-Daniel

For the value yes, but not for the destructor call.

If my understanding is correct, if I have this kind of code:

class TextFile
{
public:
    TextFile( std::string path )
    {
        m_file.open( path );
        // add checks here
    }

    ~TextFile()
    {
        m_file.close();
    }

    TextFile& operator=( const TextFile& other )
    {
        // clear our file's content
        m_file.clear();
        // copy the content of other's file
        m_file.write( other.m_file );
        return *this;
    }

private:

    FileHandler m_file;

};

Then the standard guarantees me that the destructor will be called only when the instance goes out of scope, not on copy or move.
If I understand correctly what you propose, this code wouldn’t do what I assume it would. It would call the destructor and thereby change the semantics of the type.
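
That guarantee can be observed with an instrumented type. The sketch below assumes a hypothetical Counted class (standing in for TextFile, without any file I/O) that counts constructor, destructor and assignment calls; Tally and tally_for_one_assignment are likewise illustrative names.

```cpp
#include <cassert>

// Hypothetical instrumented type: counts calls to its special members.
struct Counted {
    static int constructions;
    static int destructions;
    static int assignments;
    int id;

    explicit Counted(int i = 0) : id(i) { ++constructions; }
    Counted(const Counted& o) : id(o.id) { ++constructions; }
    Counted& operator=(const Counted& o) {
        id = o.id;
        ++assignments;
        return *this;
    }
    ~Counted() { ++destructions; }
};
int Counted::constructions = 0;
int Counted::destructions = 0;
int Counted::assignments = 0;

struct Tally {
    int constructions;
    int destructions;
    int assignments;
};

// Counts taken after "a = b" while both a and b are still alive.
Tally tally_for_one_assignment() {
    Counted::constructions = 0;
    Counted::destructions = 0;
    Counted::assignments = 0;
    Counted a(1);
    Counted b(2);
    a = b;  // only operator= runs: no destructor, no constructor
    // The return value is built before a and b are destroyed.
    return Tally{Counted::constructions, Counted::destructions,
                 Counted::assignments};
}
```

Under the proposed rewrite, the assignment would instead show up as one extra destruction and one extra construction, which is the change in semantics being objected to here.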

Joel Lamotte

"Bad design" is subjective. Writing code against the standard isn't.

Actually, the sample you provided will work without a problem, since the
switch works only on the first assignment (after the default constructor).

TextFile textfile[1000]; // default constructor doing nothing

// first assignment: we call the destructor (doing nothing) and placement new
textfile[1] = TextFile("c:\\test1.txt");
// later assignment: standard temp copy + operator= combo
textfile[1] = TextFile("c:\\test1.txt");

But I see the point: if you put code logic into the default constructor,
then yes, the first assignment will not happen. You use the switch to remove
the copy from array initialization. So yes, an actual code change is needed
in this statistically marginal case.

Anyway, code logic (opening files etc.) in default constructors is
BLASPHEMY! :) Such people actually deserve to be punished.
Punishing them a bit more is actually a good thing ;D Let's call it a feature.

Hmm... Now that I think of it, wouldn't skipping the optimization for
objects whose default constructor makes any function call except memset
solve the whole default constructor reinvocation problem? Being
statistically marginal, such objects should not noticeably affect the
overall performance gain either.

    > Now about how relevant is this previous object state when
    > exceptions start flying around from constructors ?

    > statistically 99% of exception in constructor will happen
    > due to memory exhaustion(we are already doomed) .
    > statistically 99% of member pointers in code around you
    > were not smart pointers (we are already doomed)

    > So I think it shall be tested since advantages massively
    > outweigh pretty much hypothetical disadvantages.

This is a side effect of the tyranny of democracy. :-)

So, if you want to do this kind of optimization soundly, you will
have to program some complex interprocedural analysis, which will
often be intractable anyway. :-(

So, before starting yet another religious war, why not consider this a
perfect use case for some #pragma?

- you want to solve an issue ?

- this is too complex to figure out in general ?

- but the programmer knows when it does apply...

- insert the #pragma (may be in one of the constructor declaration ?)

You just have to launch your optimization when the pragma is set.

From the standard:

16.6 Pragma directive [cpp.pragma]

A preprocessing directive of the form
  # pragma pp-tokens(opt) new-line
causes the implementation to behave in an implementation-defined manner. The behavior might cause
translation to fail or cause the translator or the resulting program to behave in a non-conforming manner.
Any pragma that is not recognized by the implementation is ignored.

So the last sentence is fine.
If your compiler does not deal with your optimization, that is no worse
than before.

Please choose a sensible namespace for the pragma to avoid
conflicts. :-)
(I am traumatized by the #pragma simd or #pragma vector found in the
compiler from a well-known processor vendor and others :-( ).

Skyscrapper city[1000];
city[1] = Skyscrapper("Empire State Building");

// release member resources acquired by the default constructor
city[1].~Skyscrapper();
try {
    new (&city[1]) Skyscrapper("Empire State Building");
} catch (...) { // cleanup
    // reacquire member resources acquired by the default constructor
    new (&city[1]) Skyscrapper();
    throw;
}

This will make the code behave as in the standard, i.e.
in the exception handler you get the object in its state after default
initialization, while still preventing the default copy from happening even
when things are fine.
The optimization will skip objects whose default constructor calls anything
except memset, to solve the state machine change caused by reinvocation.

No.

Sebastian

I’m sorry I don’t understand at all what you mean here.
I’m saying, taking the exact code I proposed, not changing it by adding a default constructor, this expression:

TextFile a ("a.txt");
TextFile b ("b.txt");
a = b; // copy the content of b.txt into a.txt

Would have, using your change, different semantics from what the standard specifies, because files would be closed and opened where I didn’t want them to be.
The type is specifically written to avoid closing and opening files until the instances really go out of scope or are explicitly deleted.

Or did I misunderstand your proposal?

Joel Lamotte