Revisions to Lazy, Functional Parser Combinators

added 492 characters in body

edited Sep 1, 2020 at 1:58

Bug: singleton objects have no allocation record

Since the garbage collector will try to set the mark() in a SYMBOL object, the T_ object needs a dummy allocation record. NIL_ doesn't need one since an INVALID object will not get marked.

pc9obj.c:

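/* T_ points at element 1 of a two-element array; element 0 is the dummy
   allocation record that sits just before it. NIL_ gets no such record. */
object T_ = &(1[(union uobject[]){{ .t = 0 },{ .Symbol = { SYMBOL, T, "T" } }}]),
       NIL_ = (union uobject[]){{ .t = INVALID }};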

Bug: using object fields for non-object data

In the pprintf() and pscanf() functions, the object field in OPERATOR objects sometimes contains a va_list *! The garbage collector might fiddle with the memory around this address if it tries to set the (nonexistent) mark(). The copious (void *) casts are a code smell. Better to use the VOID type object to hold this pointer.
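To make the idea concrete, here is a minimal sketch of wrapping a va_list * in a VOID-tagged object so that a mark phase never follows it. Everything here is an assumption for illustration: the uobj union, the TAG_ names, the in-object marked flag, and the mark(), sum_ints() and sum() functions are hypothetical and much simpler than the real union uobject and its allocation records.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical, simplified object layout; the real union uobject and its
   allocation records are not shown in the review and will differ. */
typedef enum { TAG_INVALID, TAG_OPERATOR, TAG_VOID } tag;

typedef union uobj {
    tag t;
    struct { tag t; int marked; union uobj *obj; } Operator;
    struct { tag t; int marked; void *pointer; } Void;
} uobj;

/* Toy mark phase: follows object fields, but treats a VOID payload as opaque. */
void mark(uobj *o) {
    if (o == NULL) return;
    switch (o->t) {
    case TAG_OPERATOR:
        if (o->Operator.marked) return;
        o->Operator.marked = 1;
        mark(o->Operator.obj);
        break;
    case TAG_VOID:
        o->Void.marked = 1;   /* the wrapped pointer is never dereferenced */
        break;
    default:
        break;
    }
}

/* Read n ints through a va_list that arrives wrapped in a VOID object. */
static int sum_ints(uobj *op, int n) {
    va_list *ap = op->Operator.obj->Void.pointer;
    int total = 0;
    while (n-- > 0) total += va_arg(*ap, int);
    return total;
}

/* Wrap &ap in a VOID object, hang it off an OPERATOR, mark, then use it. */
int sum(int n, ...) {
    va_list ap;
    va_start(ap, n);
    uobj holder = { .Void = { TAG_VOID, 0, &ap } };
    uobj op = { .Operator = { TAG_OPERATOR, 0, &holder } };
    mark(&op);   /* safe: only op and holder are touched, not memory near &ap */
    int total = (op.Operator.marked && holder.Void.marked) ? sum_ints(&op, n) : -1;
    va_end(ap);
    return total;
}
```

The point of the design is that the OPERATOR's object field always holds a real object, so the collector needs no special cases and the (void *) casts collapse into one wrap and one unwrap.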

Missing functions

There's some for 1 or more, many for 0 or more, and maybe for 0 or 1. But there's no function to match exactly n times, or n or more times, or from n up to m times: these kinds of quantifiers.
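As a rough illustration of what a counted quantifier does, here is a self-contained sketch. It does not use the reviewed library's lazy-list parsers: parser_fn, digit() and repeat() are hypothetical names, and this version is greedy with no backtracking, which a list-of-results combinator would handle differently.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical minimal parser type: returns the number of characters
   consumed from s, or -1 if there is no match. */
typedef int (*parser_fn)(const char *s);

/* Example parser: matches a single decimal digit. */
static int digit(const char *s) {
    return isdigit((unsigned char)*s) ? 1 : -1;
}

/* Greedily match p at least min and at most max times.
   A negative max means "no upper bound".
   Returns the total number of characters consumed, or -1 on failure. */
int repeat(parser_fn p, int min, int max, const char *s) {
    int count = 0, total = 0;
    while (max < 0 || count < max) {
        int n = p(s + total);
        if (n <= 0) break;   /* stop on failure, or on an empty match */
        total += n;
        count++;
    }
    return count >= min ? total : -1;
}
```

With this shape, repeat(p, n, n, s) is "exactly n", repeat(p, n, -1, s) is "n or more", and repeat(p, n, m, s) is "n up to m". The existing some, many and maybe would correspond to (1, -1), (0, -1) and (0, 1).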

Poor namespacing for internal symbols

enum parser_symbols {
  VALUE = SYM1, PRED, P, PP, NN, Q, R, FF, XX, AA, ID, USE, ATOM,
  SYM2
};

What are P, PP, NN, Q, R, FF, XX, AA? VALUE, PRED, and ATOM are better but still kinda vague.

Short-circuit tests (and maybe actually test stuff)

int par_main(){
  return 
      obj_main(),
      test_env(), test_parsers(),
      test_regex(),
          test_pprintf(),
          test_pscanf(),
          0;
}

Bonus formatting error. Better to short-circuit the tests based on the return values.

int par_main(){
  return  0
      ||  obj_main()
      ||  test_env()
      ||  test_parsers()
      ||  test_regex()
      ||  test_pprintf()
      ||  test_pscanf()
      ||  0;
}

Then the testing functions can return non-zero to stop producing output.
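A small self-contained sketch of that behaviour, using hypothetical stand-in tests (test_passes, test_fails, test_never_runs, and the calls counter are not from the reviewed code):

```c
#include <stdio.h>

/* Counts how many of the stand-in tests actually ran. */
static int calls = 0;

/* Hypothetical stand-ins: each returns 0 on success, non-zero on failure. */
static int test_passes(void)     { calls++; return 0; }
static int test_fails(void)      { calls++; return 1; }
static int test_never_runs(void) { calls++; puts("unreachable"); return 0; }

/* The || chain stops evaluating at the first non-zero result. */
int run_all(void) {
    return 0
        || test_passes()
        || test_fails()
        || test_never_runs();
}
```

One thing to keep in mind: || yields only 0 or 1, so the caller learns that something failed but not which code the failing test returned.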

No error reporting

A syntax error during parsing will result in an empty list being returned. Graham Hutton's paper describes how to rewrite the basic parser combinators so that meaningful error messages can be produced, without using monad transformers, which are the more typical way to do it in functional languages.
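For a feel of what a more informative result could carry, here is a minimal sketch. It is not the construction from the paper and not the reviewed library's API: the result struct, expect_char() and parse_pair() are hypothetical, and the parsers are plain functions over a string rather than lazy lists.

```c
#include <stddef.h>

/* Hypothetical result type: failure carries a position and an expectation
   instead of collapsing to "empty list". */
typedef struct {
    int ok;               /* 1 on success, 0 on failure */
    size_t pos;           /* success: index after the match; failure: where it failed */
    const char *expected; /* failure: description of what was wanted; NULL on success */
} result;

/* Match a single character c at s[pos]. */
static result expect_char(char c, const char *what, const char *s, size_t pos) {
    if (s[pos] == c) return (result){ 1, pos + 1, NULL };
    return (result){ 0, pos, what };
}

/* Sequence two parsers: the first failure is passed along unchanged,
   so the caller sees where parsing stopped and why. */
result parse_pair(const char *s) {
    result r = expect_char('(', "'('", s, 0);
    if (!r.ok) return r;
    return expect_char(')', "')'", s, r.pos);
}
```

A caller can then print something like "expected ')' at offset 1" instead of silently getting nothing back.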

added 668 characters in body

edited Jul 30, 2020 at 8:54

luser droog

added 266 characters in body

edited Jul 30, 2020 at 8:20

luser droog

added 206 characters in body

edited Jun 23, 2019 at 21:51

luser droog

answered May 15, 2019 at 3:34

luser droog
