Questions tagged [strings]
For questions about the implementation or design of strings within a programming language.
21 questions
10
votes
5
answers
1k
views
What are the considerations when modifying cases in a text?
By "cases" I mean uppercase, lowercase, and titlecase.
It seems many languages assume that there is a one-to-one correspondence between uppercase letters and lowercase letters, if the script that ...
20
votes
5
answers
3k
views
UTF-32 worth it for the string data type?
I've heard of Truffle strings from GraalVM, which store a string in one of multiple encodings, which might perhaps be like what Python (not CPython) uses for representing strings, though searching a ...
0
votes
1
answer
428
views
How might I implement a `typeid` operator (returning the type of its argument as something, presumably as a string) in my compiler?
So, here is how strings work in my AEC-to-WebAssembly compiler. After the program is parsed, all strings except inline assembly ones are gathered in a C++ ...
6
votes
0
answers
318
views
What was the first language to use backslashes as escape characters in string literals?
I assume that C didn't originate the idea that, for example, the sequence \t inside a string literal should mean a tab character, that ...
9
votes
1
answer
391
views
Creating an interface for stringifying objects that allows loop detection
Suppose a language has a Rust-style trait system but is garbage collected, and most types are reference types by default.
I could create a trait like this:
...
26
votes
6
answers
13k
views
Why do programming languages use delimiters (quotes) for strings?
Almost every programming language requires strings (or char* or equivalent) to be marked with quotes. Few languages allow other delimiters; many languages allow ...
12
votes
4
answers
563
views
Prior art on modeling characters of variable lengths
In some encodings such as UTF-8, characters are of variable length in bytes. It's a bit like a tagged union, but the exact size could be computed, so the next element in a string could follow ...
10
votes
1
answer
3k
views
In Python, why isn't it a syntax error when a list of strings is defined without comma separators?
When coding in Python, I found that defining a list of strings without separating the strings with a comma is not a syntax error. When running this code:
...
3
votes
5
answers
761
views
To what degree is it practical to let objects be stored packed? [closed]
Let's say I want my compiler to store objects packed (data and collection members aligned to non-word boundaries).
Situation #1: I want to store an array of booleans.
A few languages already let the ...
6
votes
2
answers
621
views
Why would Short String Optimization not apply to dynamic arrays?
Short string optimization is the optimization in which sufficiently short strings have their data stored inline rather than in an external buffer, so the string type ends up being a union. Swift does this, ...
17
votes
11
answers
2k
views
What are some different approaches to raw string syntax, and what are their pros and cons?
Many languages include a "raw string" literal type, intended for arbitrary (or as close as possible to arbitrary) strings of characters (and/or bytes), ignoring escape sequences and the like....
30
votes
6
answers
2k
views
How have modern language designs dealt with Unicode strings?
Languages developed over the last fifteen years or so have been within the era where Unicode is ubiquitous, and so could design their core string types accordingly. There are a lot of new issues that ...
8
votes
1
answer
389
views
Why do many dynamically typed languages identify types with strings?
I noticed that in JavaScript and Lua, 'types', i.e., those returned by typeof()/type(), are just identified by strings. As such we see ...
13
votes
4
answers
6k
views
What design trade-offs led to the "Norway problem" in YAML, and when are they worthwhile?
A well-known problem in YAML is a type-inference issue in parsing where a string is misinterpreted as a boolean. This is known as the "Norway problem", because it occurs when a field or ...
7
votes
9
answers
811
views
What are the different syntax options for formatted string literals?
One option in Python is the f-string:
f"{a} {b} {c}"
Which is equivalent to:
...