Wait the light to fall

Extracting Text Blocks with %%

焉知非鱼

Sample data #

The text in section.txt serves as the sample data:

123,456,789
=begin code
999,333,666
145,123,120
=end code
10,20,30
10,10,10
=begin code
567,555,578
678,679,665
710,720,715
=end code
321,654,987
=begin code
312,555
=end code

The goal is to extract the numbers between each =begin code and =end code pair, one block at a time.

Grammar #

The project is laid out as follows: the Section directory holds the Grammar and Actions modules, and the data directory holds the sample data, section.txt:

├── Section
│   ├── Actions.pm6
│   └── Grammar.pm6
├── data
│   └── section.txt
└── extract-section.p6

Section/Grammar.pm6:

use Grammar::Debugger;
use Grammar::Tracer;

unit grammar Section::Grammar;

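# The whole file: one or more sections separated by separator lines.
# %% (unlike %) also allows a trailing separator, e.g. a final =end code.
token TOP {
   ^  <section>+ %% <separator> $
}

# A section is a run of consecutive data lines.
token section {
    <line>+
}

# One data line: comma-separated integers ending in a newline.
token line {
   ^^ [\d+]+ %% ',' $$ \n
}

# A separator is either marker line; \n* lets the last =end code end the file.
token separator {
    |  ^^ '=begin code' $$ \n
    |  ^^ '=end code' $$ \n*
}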
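
The difference between the two separator quantifiers can be seen in isolation. The following is a minimal sketch, not part of the original modules; the variable names $strict and $lenient and the input strings are made up for illustration.

```raku
# Hypothetical regexes for illustration; only the % / %% behaviour matters here.
my $strict  = / ^ [\d+]+ %  ',' $ /;   # separators only between items
my $lenient = / ^ [\d+]+ %% ',' $ /;   # a trailing separator is also accepted

say so '1,2,3'  ~~ $strict;    # True
say so '1,2,3,' ~~ $strict;    # False: the trailing comma is left over
say so '1,2,3,' ~~ $lenient;   # True
say so '1,2,3'  ~~ $lenient;   # True
```

This is why TOP uses %%: the sample file ends with an =end code separator that has no section after it.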

Grammar::Debugger and Grammar::Tracer are modules for debugging grammars. Their use statements must come at the top of the grammar module, before the grammar is declared:

use Grammar::Debugger;
use Grammar::Tracer;

Action #

unit class Section::Actions;

method TOP($/) {
     # Collect the result of each <section>; separator captures are not needed.
     make $<section>».made;
}

method section($/) {
    make ~$/.trim;
}

method line($/) {
    make ~$/.trim;
}

method separator($/) {
    make Empty;
}
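
The action methods above attach a result to each match with make, and the caller reads it back with made. Here is a minimal, self-contained sketch of that mechanism. The Numbers grammar and NumbersActions class are hypothetical and are not part of the Section modules.

```raku
# Hypothetical grammar: a comma-separated list of integers.
grammar Numbers {
    token TOP    { <number>+ % ',' }
    token number { \d+ }
}

# Hypothetical actions: each method runs when the token of the same name matches.
class NumbersActions {
    method number($/) { make +$/ }                # store the Int value of one number
    method TOP($/)    { make $<number>».made }    # gather the stored values
}

my @n = Numbers.parse('10,20,30', :actions(NumbersActions)).made;
say @n;   # [10 20 30]
```

The same pattern applies to Section::Actions: each section stores a trimmed string, and TOP gathers those strings into the list that .made returns.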

Parsing #

Without Actions #

use lib '.';
use Section::Grammar;

my $parsed = Section::Grammar.parsefile(@*ARGS[0] // 'data/section.txt');
.Str.say for $parsed<section>;

Output #

123,456,789

999,333,666
145,123,120

10,20,30
10,10,10

567,555,578
678,679,665
710,720,715

321,654,987

312,555

With Actions #

use lib '.';
use Section::Grammar;
use Section::Actions;

my $parsed = Section::Grammar.parsefile(
    @*ARGS[0] // 'data/section.txt',
    :actions(Section::Actions)
).made;

.Str.say for @$parsed;

Output #

123,456,789
999,333,666
145,123,120
10,20,30
10,10,10
567,555,578
678,679,665
710,720,715
321,654,987
312,555
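
Both runs above print every section, including the lines that sit outside the =begin code / =end code markers. If only the marked blocks are wanted, one possible variation is to make the block itself a token. The sketch below is an assumption about how that could look, not the original approach; the CodeBlocks grammar and the shortened inline sample are hypothetical.

```raku
# Hypothetical grammar, not part of the original modules.
grammar CodeBlocks {
    token TOP   { [ <block> | <line> ]* }
    token block { ^^ '=begin code' \n <line>* ^^ '=end code' \n? }
    token line  { ^^ [\d+]+ % ',' \n }
}

# A shortened copy of the sample data; the heredoc strips the indentation.
my $text = q:to/END/;
    123,456,789
    =begin code
    999,333,666
    145,123,120
    =end code
    10,20,30
    =begin code
    312,555
    =end code
    END

my $match = CodeBlocks.parse($text) // die 'input did not match';

# One string per marked block; lines outside the blocks are parsed but ignored.
my @blocks = $match<block>.map: -> $b {
    $b<line>.map(*.Str.trim).join("\n")
};

say @blocks.join("\n---\n");
```

With this input the two blocks are printed with a --- line between them, and the lines 123,456,789 and 10,20,30 are left out.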