![]() |
| About the parse evolution | |
| guest2 | 6-Nov-2008/18:05:27+1:00 |
Je reposte ma dernière version (commentée) d'async-parser en espérant que ça va donner des idées à Carl et aux autres.(sorry for my frenglish)
In this script a tested various techniques to improve
the parse dialect.
The source is a little bit tricky to read (i like compact code)
So i have to explain some things...
The main func is [parse-seek] and accept as input a file name
and a block of parsing rules.
It's an attempt to parse a stream file, only the data requested
to fill the rules are loaded.
When the rules complete or fail, the port file is closed.
[Buffer] contain the current chunck of the file.
[Getf] is the code block wich load data into the buffer when it's necessary.
Now the problem is to modify the input rules to add [getf] in the requested places.
To do this, i used [parse] again with a special set of rules called [meta-rules].
I took the opportunity to add 2 new commands in the parse dialect:
[failed] do a skip end to abort the parsing.
[geti word! integer!] set a word with an integer stored in
little endian format.
ex: geti var 1 -> load an unsigned 8 bits integer
geti var 2 -> load an unsigned 16 bits integer
geti var 3 -> load an unsigned 24 bits integer.
geti var 4 -> load a signed 32 bits integer
(see the tests below to see an usage)
When the changes are made, the block of rules is flaged
so that it can be reused by [parse-seek] without
the need to reconstruct the rules.
It is why the input rules are directly modified and not
copied before the construct.
REBOL [
file: %async-parser.r
author: guest2
]
context [
&: &&: port: stop: n: none
buffer: make binary! 5000
failed?: false
failed: [(failed?: true ) end skip]
..: func [blk] [change/part & compose/deep blk && ]
getf: [(
if n > length? buffer [
append buffer copy/part
at port index? tail buffer
max 50 n + 1 - length? buffer]
)]
set 'parse-seek func [
[throw]
file [file! url!] rules [block!]
/binary
/local result new oopen convert-string meta-rules
][
oopen: pick [open/read/seek open/read/seek/binary] not binary
convert-string: pick [
[&: binary! &&: (.. [(as-string &/1)]) :&]
[&: [string! | char!] &&: (.. [(as-binary form &/1)]) :&]
] not binary
buffer: head buffer
either 5000 < length? buffer [buffer: make binary! 5000 recycle][clear buffer]
failed?: false
unless rules/1 = 'constructed [
parse rules meta-rules: [
some [
convert-string
| &: any-string! &&: (.. [[buffer: (to-paren compose [n: (length? &/1)]) getf (&/1)]]) :& skip
| integer! 'skip &&: (.. [[buffer: (to-paren compose [n: (&/1)]) getf n skip]]) :& skip
| 'skip &&: (.. [[buffer: (to-paren [n: 1]) getf skip]]) :& skip
| 'get word! integer! &&:
(.. [[
buffer: (to-paren compose [n: (&/3)]) getf
(to-paren compose/deep [
set [(&/2)] to integer! as-binary cp/part buffer (&/3)
buffer: skip buffer (&/3)
]) :buffer
]]) :& skip
| 'failed &&: (.. [failed]) :& skip
| 'end 'skip
| ['to | 'thru] 'end
| 'thru skip &&: (.. [some [(&/2) break | skip | failed]]) :&
| 'to skip &&: (.. use [pos][[some [pos: (&/2) :pos break | skip | failed]]]) :&
| word! &&: (if find [string! char!] type?/word get/any &/1 [.. [(to-string get &/1)] &&: &]) :&&
| paren! | path! | into meta-rules
| skip
]
]
new: reduce ['constructed cp/deep rules]
clear change rules new
]
port: (do oopen file)
if error? set/any 'result try [parse/all buffer rules/constructed][close port throw result]
close port
either failed? [false][:result]
]
]
;halt ;**** DISCARD ME *****
prin "*** TEST: is this a Rebol script ? -> "
probe parse-seek/binary %async-parser.r [
"REBOL" any [" " | tab] "[" to end
| thru "^/REBOL" any [" " | tab] "[" to end
]
print "*** TEST: get size of a JPEG file ****"
print "(note it's incomplete, it currently doesn't work with all jpg files)"
if parse-seek/binary probe %IMG_8001.jpg [
#{FFD8} ; jpeg Header
[
#{FFE0} ;* JFIF header
geti len 2 ;* get block length (2 bytes)
"JFIF" ;* yeah it's a JFIF (confirmation)
(len: len - 6) len skip ;* skip this block
some [
#{FFC0} ;* good ! i found the length properties block.
2 skip ;* don't need to know the length of this block.
skip ;* filler ??? always = #{08}
geti height 2
geti width 2
break ;* finished
| #{FF} skip ;* skip the block.
geti len 2
(len: len - 2) len skip
|
failed ;* error in the format
]
| #{FFE1} ;* EXIF header
geti len 2 ;* get length of the block
;* ... TO DO
failed
]
to end
][
?? height
?? width
]
halt
| |
| guest2 | 6-Nov-2008/19:08:23+1:00 |
| Ca m'apprendra à faire des modifs de dernière minute. At line 71, replace: | 'get word! integer! &&: by | 'geti word! integer! &&: | |
|
Login required to Post. | |