Questions tagged [strings]

For questions about the implementation or design of strings within a programming language.

10 votes
5 answers
1k views

What are the considerations when modifying cases in a text?

By "cases" I mean uppercase, lowercase, and titlecase. It seems many languages assume that there is a one-to-one correspondence between uppercase letters and lowercase letters, if the script that ...
Dannyu NDos
  • 1,485
20 votes
5 answers
3k views

UTF-32 worth it for the string data type?

I've heard of Truffle strings from GraalVM, which store a string in one of multiple encodings, perhaps like what Python (not CPython) uses for representing strings, though searching a ...
Hydroper
  • 397
0 votes
1 answer
428 views

How might I implement a `typeid` operator (returning the type of its argument as something, presumably as a string) in my compiler?

So, here is how strings work in my AEC-to-WebAssembly compiler. After the program is parsed, all strings except inline assembly ones are gathered in a C++ ...
FlatAssembler
6 votes
0 answers
318 views

What was the first language to use backslashes as escape characters in string literals?

I assume that C didn't originate the idea that, for example, the sequence \t inside a string literal should mean a tab character, that ...
Karl Knechtel
9 votes
1 answer
391 views

Creating an interface for stringifying objects that allows loop detection

Suppose a language has a Rust-style trait system but is garbage collected and most types are reference types by default. I could create a trait like this: ...
mousetail
  • 9,559
26 votes
6 answers
13k views

Why do programming languages use delimiters (quotes) for strings?

Almost every programming language requires strings (or char* or equivalent) to be marked with quotes. Few languages allow other delimiters, many languages allow ...
Safwan Samsudeen
12 votes
4 answers
563 views

Prior art on modeling characters of variable lengths

In some encodings such as UTF-8, characters are of variable length in bytes. It's a bit like a tagged union, but the exact size could be computed, so the next element in a string could follow ...
user23013
  • 3,314
10 votes
1 answer
3k views

In Python, why isn't it a syntax error when a list of strings is defined without comma separators?

When coding in Python, I found that defining a list of strings without separating the strings with a comma is not a syntax error. When running this code: ...
Redz
  • 1,096
3 votes
5 answers
761 views

To what degree is it practical to let objects be stored packed? [closed]

Let's say I want my compiler to store objects packed (data and collection members aligned to non-word boundaries). Situation #1: I want to store an array of booleans. A few languages already let the ...
Dannyu NDos
  • 1,485
6 votes
2 answers
621 views

Why would Short String Optimization not apply to dynamic arrays?

Short string optimization is the optimization in which sufficiently short strings have their data stored inline rather than in an external buffer, so the string type ends up being a union. Swift does this, ...
Bbrk24
  • 9,672
17 votes
11 answers
2k views

What are some different approaches to raw string syntax, and what are their pros and cons?

Many languages include a "raw string" literal type, intended for arbitrary (or as close as possible to arbitrary) strings of characters (and/or bytes), ignoring escape sequences and the like....
rydwolf
  • 4,870
30 votes
6 answers
2k views

How have modern language designs dealt with Unicode strings?

Languages developed over the last fifteen years or so have been within the era where Unicode is ubiquitous, and so could design their core string types accordingly. There are a lot of new issues that ...
Michael Homer
  • 15.3k
8 votes
1 answer
389 views

Why do many dynamically typed languages identify types with strings?

I noticed that in JavaScript and Lua, 'types', i.e., those returned by typeof()/type(), are just identified by strings. As such we see ...
CPlus
  • 10.5k
13 votes
4 answers
6k views

What design trade-offs led to the "Norway problem" in YAML, and when are they worthwhile?

A well-known problem in YAML is a type-inference issue in parsing where a string is misinterpreted as a boolean. This is known as the "Norway problem", because it occurs when a field or ...
Michael Homer
  • 15.3k
7 votes
9 answers
811 views

What are the different syntax options for formatted string literals?

One option in Python is the f-string: `f"{a} {b} {c}"`, which is equivalent to: ...
The Thonnu
  • 1,576
