Refactor to use statediff plugin #1
12
README.md
12
README.md
@ -125,8 +125,8 @@ Config format:
|
||||
* Combine output from multiple workers and copy to post-processed output directory:
|
||||
|
||||
```bash
|
||||
# public.blocks
|
||||
cat {output_dir,output_dir/*}/public.blocks.csv > output_dir/processed_output/combined-public.blocks.csv
|
||||
# ipld.blocks
|
||||
cat {output_dir,output_dir/*}/ipld.blocks.csv > output_dir/processed_output/combined-ipld.blocks.csv
|
||||
|
||||
# eth.state_cids
|
||||
cat output_dir/*/eth.state_cids.csv > output_dir/processed_output/combined-eth.state_cids.csv
|
||||
@ -144,8 +144,8 @@ Config format:
|
||||
* De-duplicate data:
|
||||
|
||||
```bash
|
||||
# public.blocks
|
||||
sort -u output_dir/processed_output/combined-public.blocks.csv -o output_dir/processed_output/deduped-combined-public.blocks.csv
|
||||
# ipld.blocks
|
||||
sort -u output_dir/processed_output/combined-ipld.blocks.csv -o output_dir/processed_output/deduped-combined-ipld.blocks.csv
|
||||
|
||||
# eth.header_cids
|
||||
sort -u output_dir/processed_output/eth.header_cids.csv -o output_dir/processed_output/deduped-eth.header_cids.csv
|
||||
@ -171,8 +171,8 @@ Config format:
|
||||
# public.nodes
|
||||
COPY public.nodes FROM '/output_dir/processed_output/public.nodes.csv' CSV;
|
||||
|
||||
# public.blocks
|
||||
COPY public.blocks FROM '/output_dir/processed_output/deduped-combined-public.blocks.csv' CSV;
|
||||
# ipld.blocks
|
||||
COPY ipld.blocks FROM '/output_dir/processed_output/deduped-combined-ipld.blocks.csv' CSV;
|
||||
|
||||
# eth.header_cids
|
||||
COPY eth.header_cids FROM '/output_dir/processed_output/deduped-eth.header_cids.csv' CSV;
|
||||
|
@ -2,13 +2,13 @@
|
||||
|
||||
* For a given table in the `ipld-eth-db` schema, we know the number of columns to be expected in each row in the data dump:
|
||||
|
||||
| Table | Expected columns |
|
||||
| ----------------- |:----------------:|
|
||||
| public.nodes | 5 |
|
||||
| public.blocks | 3 |
|
||||
| eth.header_cids | 16 |
|
||||
| eth.state_cids | 8 |
|
||||
| eth.storage_cids | 9 |
|
||||
| Table | Expected columns |
|
||||
|--------------------|:----------------:|
|
||||
| `public.nodes` | 5 |
|
||||
| `ipld.blocks` | 3 |
|
||||
| `eth.header_cids` | 16 |
|
||||
| `eth.state_cids` | 8 |
|
||||
| `eth.storage_cids` | 9 |
|
||||
|
||||
### Find Bad Data
|
||||
|
||||
@ -29,7 +29,7 @@
|
||||
```bash
|
||||
./scripts/find-bad-rows.sh -i eth.state_cids.csv -c 8 -o res.txt -d true
|
||||
```
|
||||
|
||||
|
||||
Output:
|
||||
|
||||
```
|
||||
@ -40,7 +40,7 @@
|
||||
|
||||
```bash
|
||||
./scripts/find-bad-rows.sh -i public.nodes.csv -c 5 -o res.txt -d true
|
||||
./scripts/find-bad-rows.sh -i public.blocks.csv -c 3 -o res.txt -d true
|
||||
./scripts/find-bad-rows.sh -i ipld.blocks.csv -c 3 -o res.txt -d true
|
||||
./scripts/find-bad-rows.sh -i eth.header_cids.csv -c 16 -o res.txt -d true
|
||||
./scripts/find-bad-rows.sh -i eth.state_cids.csv -c 8 -o res.txt -d true
|
||||
./scripts/find-bad-rows.sh -i eth.storage_cids.csv -c 9 -o res.txt -d true
|
||||
@ -66,7 +66,7 @@
|
||||
|
||||
```bash
|
||||
./scripts/filter-bad-rows.sh -i public.nodes.csv -c 5 -o cleaned-public.nodes.csv
|
||||
./scripts/filter-bad-rows.sh -i public.blocks.csv -c 3 -o cleaned-public.blocks.csv
|
||||
./scripts/filter-bad-rows.sh -i ipld.blocks.csv -c 3 -o cleaned-ipld.blocks.csv
|
||||
./scripts/filter-bad-rows.sh -i eth.header_cids.csv -c 16 -o cleaned-eth.header_cids.csv
|
||||
./scripts/filter-bad-rows.sh -i eth.state_cids.csv -c 8 -o cleaned-eth.state_cids.csv
|
||||
./scripts/filter-bad-rows.sh -i eth.storage_cids.csv -c 9 -o cleaned-eth.storage_cids.csv
|
||||
|
Loading…
Reference in New Issue
Block a user