Author: 执子之手2502891083 | Source: Internet | 2024-10-11 11:26
I'm implementing a custom lexer in C++ and when attempting to read in whitespace, the ifstream won't read it out. I'm reading character by character using >>, and all the whitespace is gone. Is there any way to make the ifstream keep all the whitespace and read it out to me? I know that when reading whole strings, the read will stop at whitespace, but I was hoping that by reading character by character, I would avoid this behaviour.
Attempted: .get(), recommended by many answers, but it has the same effect as std::noskipws, that is, I get all the spaces now, but not the new-line character that I need to lex some constructs.
Here's the offending code (extended comments truncated)
while(input >> current) {
    always_next_struct val = always_next_struct(next);
    if (current == L' ' || current == L'\n' || current == L'\t' || current == L'\r') {
        continue;
    }
    if (current == L'/') {
        input >> current;
        if (current == L'/') {
            // explicitly empty while loop
            while(input.get(current) && current != L'\n');
            continue;
        }
I'm breaking on the while line and looking at every value of current as it comes in, and \r or \n are definitely not among them; the input just skips to the next line in the input file.
8 solutions
The operator>> eats whitespace (space, tab, newline). Use yourstream.get() to read each character.
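To make the difference concrete, here is a minimal sketch that reads the same input once with operator>> and once with get(). The helper names read_with_extraction and read_with_get are made up for this illustration, and a std::istringstream stands in for the file stream.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads every character with formatted extraction.
// With the default skipws flag set, operator>> discards whitespace first.
std::string read_with_extraction(std::istream& in) {
    std::string out;
    char c;
    while (in >> c) {
        out += c;
    }
    return out;
}

// Hypothetical helper: reads every character with unformatted input.
// get() skips nothing, so spaces, tabs and newlines are all kept.
std::string read_with_get(std::istream& in) {
    std::string out;
    char c;
    while (in.get(c)) {
        out += c;
    }
    return out;
}
```

Given the input "a b\n\tc", the first helper should return "abc" and the second should return the input unchanged.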
Edit:
Beware: Platforms (Windows, Un*x, Mac) differ in how they encode a newline. It can be '\n', '\r' or both. What you see also depends on how you open the file stream (text or binary).
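If the lexer has to cope with all three conventions, one option is to fold them into a single '\n' while reading. The following is only a sketch: normalize_newlines is a hypothetical helper, and it assumes the stream delivers the raw characters (for example a file opened in binary mode).

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: copies the stream into a string, turning
// "\r\n" and a lone '\r' into a single '\n'.
std::string normalize_newlines(std::istream& in) {
    std::string out;
    char c;
    while (in.get(c)) {
        if (c == '\r') {
            // Swallow the '\n' of a "\r\n" pair, if there is one.
            if (in.peek() == '\n') {
                in.get(c);
            }
            out += '\n';
        } else {
            out += c;
        }
    }
    return out;
}
```

A lexer could run on the normalized text and then only ever needs to test for '\n'.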
Edit (analyzing code):
After

    while(input.get(current) && current != L'\n');
    continue;

there will be a \n in current, unless the end of the file was reached. After that you continue with the outermost while loop, where the first character on the next line is read into current. Is that not what you wanted?
I tried to reproduce your problem (using char and cin instead of wchar_t and wifstream):
//: get.cpp : compile, then run: get
#include <iostream>

int main()
{
    char c;
    while (std::cin.get(c))
    {
        if (c == '/')
        {
            char last = c;
            if (std::cin.get(c) && c == '/')
            {
                // std::cout << "Read to EOL\n";
                while (std::cin.get(c) && c != '\n'); // this comment will be skipped
                // std::cout << "go to next line\n";
                std::cin.putback(c);
                continue;
            }
            else { std::cin.putback(c); c = last; }
        }
        std::cout << c;
    }
}
This program, applied to itself, eliminates all C++ line comments in its output. The inner while loop doesn't eat up all text to the end of file. Please note the putback(c) statement. Without that the newline would not appear.
If it doesn't work the same for wifstream, it would be very strange, except for one reason: when the opened text file is not saved with 16-bit characters, the \n char ends up in the wrong byte...
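For comparison, the same idea can be tried on a wide stream. This is a sketch rather than the asker's lexer: strip_line_comments is a hypothetical helper, and the usage note below assumes a std::wistringstream, so no file encoding or locale conversion is involved. With a real wifstream the bytes are converted according to the imbued locale, which is where an encoding mismatch would show up.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: copies a wide stream into a string, dropping
// C++ line comments but keeping the newline that ends each of them.
std::wstring strip_line_comments(std::wistream& in) {
    std::wstring out;
    wchar_t c;
    while (in.get(c)) {
        if (c == L'/') {
            wchar_t next;
            if (in.get(next)) {
                if (next == L'/') {
                    // Line comment: discard everything up to the newline.
                    while (in.get(c) && c != L'\n') {
                    }
                    // The stream is still good only if a newline was found.
                    if (in) {
                        out += L'\n';
                    }
                    continue;
                }
                // Not a comment: give the second character back.
                in.putback(next);
            }
        }
        out += c;
    }
    return out;
}
```

Feeding it a std::wistringstream holding L"a // x\nb/c/" should give L"a \nb/c/": the comment text disappears while the newline and the single slashes survive.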