Splitting a file into multiple files based on trigger words in the first column.

I've got a large set of data (.dat) that I need to split whenever a specific text string is mentioned. For example, I've got:
Dataset_1_1 Set Number
1234 1234
.... ....
Dataset_1_2 Set Number2
5678 5678
.... ....
[I need to make the split here]
Dataset_2_1 Set Number
1234 1234
.... ....
Dataset_2_2 Set Number2
5678 5678
.... ....
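And so on. I need to keep all of the "Dataset_1" sets together, meaning "Dataset_1_1" needs to stay with "Dataset_1_34", but as soon as "Dataset_2_1" is detected/read, the split needs to happen. Unfortunately, the number of lines between "Dataset_1" and "Dataset_2" (millions of lines) and between each data set varies, so I need to split primarily on the name.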
Can Matlab "read" the first column of lines, find where "Dataset_1_1", "Dataset_2_1", "Dataset_3_1", etc. is and split them at those points and then save each to a new dat file?

Answers (1)

dpb
dpb on 2 Aug 2021
Edited: dpb on 2 Aug 2021
This is where a filter is probably best, given the size of the file and the unknown number of lines between sections, and since you don't need to hold anything but a single record at a time...
fid=fopen('inputfile.dat');
fnum=1;
fout=compose("Dataset%04d.dat",fnum);      % initial output file
fod=fopen(fout,'w');                       % open for writing
fnum=fnum+1;                               % ready for next file
linechk=compose("Dataset_%d",fnum);        % next set indicator string
while ~feof(fid)
  l=fgets(fid);                            % get input line w/ \n
  if contains(l,linechk)                   % found the new test record
    fod=fclose(fod);                       % close the finished test file
    fout=compose("Dataset%04d.dat",fnum);  % next output file
    fod=fopen(fout,'w');                   % open for writing
    fnum=fnum+1;                           % get ready for next file
    linechk=compose("Dataset_%d",fnum);    % next set indicator string
  end
  fprintf(fod,'%s',l);                     % echo line from input to output file
end
fclose('all')                              % close both files
"Air code", untested but think it's close...

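A possible variant of the loop above, in case the group numbers skip values or do not start at 1: rather than predicting the next "Dataset_N" string, it reads the group number from the start of each header line and opens a new file whenever that number changes. This is only a sketch under assumptions. The function name splitByGroup and its two arguments are made up for illustration, header lines are assumed to start in column 1 with Dataset_<N>_<M>, and any lines before the first header are skipped. Like the code above, it has not been run against the real data.

```matlab
function nFiles = splitByGroup(inFile, outPattern)
% splitByGroup  Hypothetical helper, not part of the original answer.
% Copies inFile line by line and starts a new output file whenever the
% group number N of a leading "Dataset_N_M" token changes.
% outPattern is a sprintf format with one %d slot, e.g. 'Dataset%04d.dat'.
    fid = fopen(inFile, 'r');
    assert(fid > 0, 'Could not open input file %s', inFile);
    closeIn = onCleanup(@() fclose(fid));    % close the input even on error

    fod = -1;        % no output file open yet
    current = NaN;   % group number currently being written
    nFiles = 0;      % number of output files created
    while ~feof(fid)
        l = fgets(fid);                      % read one line, keep its terminator
        if ~ischar(l), break, end            % fgets returns -1 at end of file
        % Assumed header layout: "Dataset_<N>_<M>" at the very start of the line
        tok = regexp(l, '^Dataset_(\d+)_\d+', 'tokens', 'once');
        if ~isempty(tok)
            grp = str2double(tok{1});
            if grp ~= current                % always true while current is NaN
                if fod > 0, fclose(fod); end % close the finished group
                fod = fopen(sprintf(outPattern, grp), 'w');
                assert(fod > 0, 'Could not open output file for group %d', grp);
                current = grp;
                nFiles = nFiles + 1;
            end
        end
        if fod > 0
            fprintf(fod, '%s', l);           % echo the line unchanged
        end
    end
    if fod > 0, fclose(fod); end
end
```

Called as splitByGroup('inputfile.dat', 'Dataset%04d.dat'), it should produce the same file names as the loop above. The second argument is passed to sprintf, so it is safest to keep it free of backslashes and stray percent signs.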