Character cr slow. Alternatives?

Post by TriSebastian » Tue Mar 08, 2011 8:05 pm

Hi all!

I need to parse and process long files and extract single lines from them.

I implemented forward and backward file readers based on CfsFileDescriptor. That is, I don't open a file and create a CfsReadStream on it; I just work with the file pointers.
This implementation cut the run time from 1.5 hours to 8 to 12 minutes, and memory use stays low, which is critical with my large files.

The PerformanceWorkbench shows me quite clearly that the usual substrings: method is also quite slow. I implemented my own version, which checks only for Character cr instead of checking for all separators. This brought me down to under 4 minutes.
But PerformanceWorkbench still tells me that 37% of my time is lost in the method Character cr.
I tried comparing my chars against '
'
but evaluating '
' = Character cr
answers "false".

Is there another way to find carriage returns in a String other than checking for Cr?

Thanks for ideas and help!
Sebastian
TriSebastian
 
Posts: 76
Joined: Sun Jul 20, 2008 9:40 pm
Location: Nanaimo, BC, Canada

Re: Character cr slow. Alternatives?

Post by TriSebastian » Tue Mar 08, 2011 8:26 pm

:-D

I guess it is too late in the day... I moved the call to Character cr into the calling method and now pass the separator as a parameter. Now everything is fine:
it is the called method that is slow ;-)

And that is no wonder.

ESString>>#findDelimiters: delimiters startingAt: startIndex
and
ESString>>#skipDelimiters: delimiters startingAt: endIndex

I tweaked them, but not enough as it seems...
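
For readers following along, the idea of fetching the separator once in the caller and scanning only for that one character can be sketched roughly as below. This is a minimal workspace sketch, not the actual reader code from this post: the names buffer, cr, lines, start and stop are hypothetical, and the sample buffer is made up for illustration.

```smalltalk
| buffer cr lines start stop |
"Hypothetical sample buffer: three lines separated by a single Cr."
buffer := 'first', (String with: Character cr), 'second', (String with: Character cr), 'third'.
"Fetch the separator once, outside the loop, instead of sending Character cr per comparison."
cr := Character cr.
lines := OrderedCollection new.
start := 1.
[start <= buffer size] whileTrue: [
    "Advance stop to the next Cr, or one past the end of the buffer."
    stop := start.
    [stop <= buffer size and: [(buffer at: stop) ~= cr]]
        whileTrue: [stop := stop + 1].
    "Copy the line without its separator and continue after the Cr."
    lines add: (buffer copyFrom: start to: stop - 1).
    start := stop + 1].
lines
```

Inspecting lines after evaluating this should show the three strings 'first', 'second' and 'third'. The inner loop still touches every character once, so the total cost is driven by how often the scan is called, not by the cr lookup.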

Benchmark Workshop rocks!

@tc, it would be a great improvement to VASmalltalk if, e.g., the HierarchyBrowser had a benchmark integration! Just select/mark a method and trace its performance, just like in the code coverage tool...

Sorry, but I'll go to bed now ;-)
Have a look at this great tool! It's worth it!
Sebastian

Re: Character cr slow. Alternatives?

Post by wembley » Wed Mar 09, 2011 6:47 am

Sebastian -

You're working too late at night :)

'
' = Character cr


will always return false for 2 reasons:

1) You are comparing a string to a character
2) The string contains 2 characters (Cr and Lf) on Windows

Character cr should be really fast since it is just returning a literal (2 bytecodes: pushLiteral, returnTopOfStack).
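
The two reasons above can be checked in a workspace with a short sketch like the following. It assumes a Windows-style line ending (Cr followed by Lf), as described above; the variable name crlf is only illustrative.

```smalltalk
| crlf |
"Build the two-character string explicitly instead of typing a line break inside quotes."
crlf := String with: Character cr with: Character lf.

crlf = Character cr.                  "false: a String is compared to a Character"
(crlf at: 1) = Character cr.          "true: the first element is Cr"
(crlf at: 2) = Character lf.          "true: the second element is Lf"
(String with: Character cr) = crlf.   "false: one character versus two"
```

Evaluate each expression separately (for example with Display) to see the individual results.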
John O'Keefe [|], Principal Smalltalk Architect, Instantiations Inc.
wembley
Moderator
 
Posts: 405
Joined: Mon Oct 16, 2006 3:01 am
Location: Durham, NC

Re: Character cr slow. Alternatives?

Post by TriSebastian » Wed Mar 09, 2011 7:26 am

Hi John,

I guess I was confused by the way the Performance Workbench presents its results.
It selected the cr and even showed a high percentage there. That's why I was puzzled that Cr should be slow.

But I changed the code, and now it really is the string delimiter search loop that is slow. Or rather, it takes up a lot of my processing time because it is called so often.

Yes, indeed, I forgot the asString... and, thank you, the lf was missing as well.

Hmm... maybe I'll decrease my string buffer from size 1024 to 512; this might shorten the time spent in the delimiter search loop. I hope the additional file read accesses won't kill the hard disk earlier ;-) But in the end it's not mine :-D

Thank you for the lf hint. I'll stay with Character cr. The asString and additional lf checks might perform even worse.

CU at CampSmalltalk!
Sebastian

