I wrote a toy s-expression parser, and I'd like to know if I can make it more Rusty. I'm not terribly worried about the functionality. It's only a Rust exercise for me.

use regex::Regex;

enum Token {
    Number(u64),
    Identifier(String),
    ParOpen,
    ParClose,
}

fn tokenize (source: &str) -> Result<Vec<Token>, String> {
    let mut tokens: Vec<Token> = Vec::new();
    let regex = Regex::new(r"[0-9][-+*?!_a-zA-Z0-9]*|[-+*?!_a-zA-Z][-+*?!_a-zA-Z0-9]*|\(|\)").unwrap();
    for candidate_match in regex.find_iter(source) {
        let candidate_string = candidate_match.as_str();
        match candidate_string.chars().next().unwrap() {
            '-' | '+' | '*' | '?' | '!' | '_' | 'a'..='z' | 'A'..='Z' => tokens.push(Token::Identifier(candidate_string.to_owned())),
            '0'..='9' => {
                match candidate_string.parse::<u64>() {
                    Ok(n) => tokens.push(Token::Number(n)),
                    Err(_) => return Err(format!("Invalid number: {}", candidate_string)),
                }
            },
            '(' => tokens.push(Token::ParOpen),
            ')' => tokens.push(Token::ParClose),
            _ => (),
        }
    }
    Ok(tokens)
}

enum Node {
    Number(u64),
    Identifier(String),
    List(Vec<Node>),
}

fn parse (tokens: &[Token]) -> Result<Node, String> {
    let mut index: usize = 0;
    let mut stack: Vec<Vec<Node>> = vec![Vec::new()];
    loop {
        match &tokens[index] {
            Token::Number(value) => {
                stack.last_mut().unwrap().push(Node::Number(value.clone()));
            },
            Token::Identifier(name) => {
                stack.last_mut().unwrap().push(Node::Identifier(name.clone()));
            }
            Token::ParOpen => {
                stack.push(Vec::new());
            },
            Token::ParClose => {
                if stack.len() <= 1 {
                    return Err("Unexpected ')'".to_owned());
                }
                let list = Node::List(stack.pop().unwrap());
                stack.last_mut().unwrap().push(list);
            },
        }
        if index >= tokens.len() - 1 {
            if stack.len() > 1 {
                return Err("Expected ')'".to_owned());
            }
            return Ok(Node::List(stack.pop().unwrap()));
        }
        index += 1;
    }
}

fn stringify (node: &Node) -> String {
    // Inner helper: renders nested lists wrapped in parentheses.
    fn stringify (node: &Node) -> String {
        match node {
            Node::Number(value) => value.to_string(),
            Node::Identifier(name) => name.clone(),
            Node::List(entries) => format!("({})", entries.iter().map(stringify).collect::<Vec<String>>().join(" ")),
        }
    }
    // The root list returned by `parse` is rendered without parentheses.
    match node {
        Node::Number(value) => value.to_string(),
        Node::Identifier(name) => name.clone(),
        Node::List(entries) => entries.iter().map(stringify).collect::<Vec<String>>().join(" "),
    }
}

https://godbolt.org/z/fP6fd41Tr
I am not concerned with whether this is a good s-expression parser. This is just a Rust exercise for me, not a parsing one, so I am interested in Rust-related feedback.
I'm a bit new to the language, and my code looks verbose to me. Can I simplify any of it? Am I copying data around without good reason? Is there a nice way to compile that regex once and store it somewhere?
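
For context on the last question, one standard-library pattern that might apply is `std::sync::OnceLock` (stable since Rust 1.70), which initialises a value on first use and hands out a `'static` reference afterwards. The following is only a minimal sketch under stated assumptions: `compile_pattern` and `token_pattern` are hypothetical names, and a plain `String` stands in for the compiled `Regex` so that the sketch needs no external crate. In the real code the `Regex::new(...).unwrap()` call would presumably go where `compile_pattern` builds its value.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Counts how often the expensive constructor runs (illustration only).
static COMPILE_COUNT: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for `Regex::new(...).unwrap()`.
// A `String` is used here so the sketch has no external dependencies.
fn compile_pattern() -> String {
    COMPILE_COUNT.fetch_add(1, Ordering::SeqCst);
    String::from(r"\(|\)")
}

// Returns the shared pattern, building it on the first call only.
fn token_pattern() -> &'static String {
    static PATTERN: OnceLock<String> = OnceLock::new();
    PATTERN.get_or_init(compile_pattern)
}
```

With this shape, `tokenize` would call `token_pattern()` instead of constructing the pattern on every invocation; repeated calls return the same reference, and `compile_pattern` runs only once. `std::sync::LazyLock` (stable since Rust 1.80) and the `once_cell` and `lazy_static` crates offer similar behaviour with slightly different syntax.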