Skip to main content
17 votes
Accepted

How does the GLL parsing algorithm work?

The reason LL grammars can't handle left-recursion or ambiguity is because those are both situations where the grammar has to make a choice based on something that it won't discover until the future. ...
davidbak's user avatar
  • 762
16 votes

How would you test a lexer?

If you're writing the lexer yourself, this seems like an ideal case for test-driven development. While “the number of combinations of tokens in a source file can be huge,” the number of branches in ...
Arseni Mourzenko's user avatar
16 votes
Accepted

How would you test a lexer?

Your grammar probably has some rules for each token on how it can be produced (for example, that a { signifies a BLOCK_START token, or that a string-literal token is delimited by " characters). ...
Bart van Ingen Schenau's user avatar
15 votes
Accepted

What are the 'practical' advantages of LR parser over LL parser 'in today'?

True LL parsers are fast and simple – but also completely unable to parse many interesting languages. Many real-world programming languages are in the LALR class, which is closely related to LR but ...
amon's user avatar
  • 136k
15 votes

Rewrite or Transpiler - How to move away from a proprietary SAAS solution

Let's me take on your issues one-by-one: the current implementation has bugs Yep, and when you transpile such a code, what makes you think those bugs will not be transpiled as well? With a rewrite, ...
Doc Brown's user avatar
  • 220k
12 votes

How to Unit-Test a parser of a file?

I want to write tests for it. What you are intending to test? I want to use TDD. I'm refactoring a parser and want to test the 'parse()' method. So the aim is to clean things up. I would argue that ...
Arnis Lapsa's user avatar
11 votes
Accepted

Should my lexer allow what is obviously a syntax error?

Your lexer is never going to be able to diagnose all syntax errors unless you make it as powerful as the parser itself. This would be a large and totally unnecessary amount of work, and the only ...
Kilian Foth's user avatar
10 votes

How can one interpret an Abstract Syntax Tree without recursion?

You can systematically calculate such an approach. For simplicity, I'll consider evaluating arithmetic expressions, but the transforms are completely mechanical. I'll use Haskell as the language. data ...
Derek Elkins left SE's user avatar
10 votes
Accepted

C++ Recursive Descent Parser: Global Variable Dilemma

The group of methods which belong to one parser are tightly coupled because of the nature of the problem. The parser has some state to manage which is shared between all those functions. So putting ...
Doc Brown's user avatar
  • 220k
10 votes
Accepted

Reasons to use (and not to use) a repeated delimiter to escape that delimiter?

It seems the parser would need to recognize the ', but then conditionally undo that decision based on the next character. While this a fairly minor check, it seems strange to me given the alternative ...
Flater's user avatar
  • 59.5k
10 votes

Is it possible to make a compiler for any dynamic/script/interpreter language

Can we (Is it possible to) make a compiler for any dynamic/script/interpreter language, Yes, it is. Note that there is no such thing as an "interpreter" language. An interpreter is a ...
Jörg W Mittag's user avatar
9 votes

How to Unit-Test a parser of a file?

This approach may work but as far as I understand from unit test methodology, unit-tests shouldn't perform I/O I think you are overlooking the elephant in the room here: your parser should not perform ...
nvoigt's user avatar
  • 9,230
9 votes

Is it possible to make a compiler for any dynamic/script/interpreter language

Yes, it is possible to create a “compiler” for any language, no matter how dynamic. But this isn't necessarily a useful thing to do. 1. Many languages like Ruby, Python, and JavaScript are extremely ...
amon's user avatar
  • 136k
8 votes

Should a SVG be converted to JSON to be stored in the database?

An other software engineer in my team told me to use JSON but couldn't substantiate why? Maybe it was a joke? I can't think of any benefit to storing SVG inside JSON. None. I would store SVG files......
Dan Wilson's user avatar
  • 3,163
8 votes
Accepted

Is a literal out of range a syntax error or a semantic error?

It depends on the grammar of the language. If the grammar precludes recognizing 256 or 257, then those character sequences are syntax errors — that character sequence is outside of what can be ...
Erik Eidt's user avatar
  • 34.8k
7 votes
Accepted

Meaningful response to the user after his uploaded CSV was processed?

This depends on the way the CSV is created, maintained or fixed in case of errors, and on how the imported data will be processed after. Let's start with the question of "all or nothing" vs. "reject ...
Doc Brown's user avatar
  • 220k
7 votes
Accepted

How do different file types generally store data?

The reason there are many different file formats is that there are many different goals for the way data is formatted. Some of these are in opposition to each other and some are orthogonal to each ...
JimmyJames's user avatar
  • 30.9k
7 votes
Accepted

Are parsers a special case in unit testing?

Parsers are not a special case, what you desribe is somewhat common. If the class in question is of a reasonable size, adheres to seperation of concerns, single-responsibility principle and so on, ...
JacquesB's user avatar
  • 62.3k
7 votes
Accepted

Backend - own database vs live parsing

I strongly recommend using a database with a well-defined schema. Why? Independence/Decoupling. With a database, things are under your control, not theirs. Lets look at the pros and cons of each ...
marstato's user avatar
  • 4,638
7 votes

Should a SVG be converted to JSON to be stored in the database?

Neither. You should store the model the application uses to generate the svg. ie for a network diagram the various nodes and joins. That way when you change the svg generation code the old models ...
Ewan's user avatar
  • 84.4k
7 votes
Accepted

How can one interpret an Abstract Syntax Tree without recursion?

You need an operand stack (to store intermediate values) as well as a call stack where you place something to remind you what to do next when you return from a sub-activity. Most CPUs mix those two ...
Hans-Martin Mosner's user avatar
7 votes
Accepted

Is it wrong to parse YAML `true` as a string?

YAML is an extremely complicated format, and the result of that document may differ between YAML versions and parsers. YAML itself does not prescribe a data model beyond scalars/sequences/mappings/...
amon's user avatar
  • 136k
7 votes

How would you test a lexer?

One alternative that others aren't mentioning, is to use a test generative approach—like QuickCheck from Haskell—to generate the edge cases from the grammar you've defined. Now, once they're generated,...
A T's user avatar
  • 761
6 votes
Accepted

Which is a more efficient approach to decoding escape sequences in text?

Most parsers decode string escapes during tokenization. Note that the tokenizer needs to be aware of escapes anyway to determine the end of a string, since the string delimiter " itself may be escaped....
amon's user avatar
  • 136k
6 votes

How exactly is an Abstract Syntax Tree created?

I know this question is 4+ years old but I feel I should add a more detailed answer. Abstract Syntax Trees are created no differently from other trees; the more true statement in this case is that ...
Nergal's user avatar
  • 313
6 votes
Accepted

AST design: Call is both expression and statement?

The AST is a bridge between the concrete syntax and the semantic model of your language. It does not have to conform exactly to the syntax. In particular, it is generally OK if the AST could ...
amon's user avatar
  • 136k
6 votes
Accepted

Is there a name for these diagrams that show the flow of a valid value

They are called railroad diagrams or syntax diagrams. Syntax diagrams (or railroad diagrams) are a way to represent a context-free grammar. They represent a graphical alternative to Backus–Naur form ...
Toon Krijthe's user avatar
  • 2,624
6 votes

How does the GLL parsing algorithm work?

Since I don't have enough reputation to comment I am dropping this as an answer. Adrian Johnstone (the co-author of the GLL parsing paper) has an implementation of the GLL parser generator in his tool ...
Venkatesh Prasad Kannan's user avatar
6 votes

What type of syntax notation is this?

It is a type of syntax diagram called a railroad diagram. See also here, here, and here. Thanks to @davidbak who found the answer.
user497745's user avatar
6 votes

Fail fast is brittle

For situations where some data constellation indicates a bug in code, "fail fast" is usually the only viable a strategy - the program shall be stopped, and the programmer or vendor needs to ...
Doc Brown's user avatar
  • 220k

Only top scored, non community-wiki answers of a minimum length are eligible