4
\$\begingroup\$

(The story continues in A simple method for compressing white space in text (Java) - Take II.)

Intro

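I now have this text space compressor. It removes leading and trailing whitespace and collapses each inner run of whitespace into a single space. For example,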

"   hello   world "      -> "hello world" 
" \n world  \t hello   " -> "world hello"

Code

package io.github.coderodde.text;

import java.util.Objects;

/**
 * This class provides a linear-time method for compressing whitespace in text.
 * @author Rodion "rodde" Efremov
 * @version 1.0.0 (Oct 30, 2025)
 * @since 1.0.0 (Oct 30, 2025)
 */
public final class TextSpaceCompressor {

    public static String spaceCompress(String text) {
        Objects.requireNonNull(text);
        StringBuilder sb = new StringBuilder();
        
        int textLength = text.length();
        int loIndex = 0;
        int hiIndex = textLength - 1;
        
        // Scan empty prefix if any:
        for (; loIndex < hiIndex; ++loIndex) {
            if (!Character.isWhitespace(text.charAt(loIndex))) {
                break;
            }
        }
        
        // Scan empty suffix if any:
        for (; hiIndex > loIndex; --hiIndex) {
            if (!Character.isWhitespace(text.charAt(hiIndex))) {
                break;
            }
        }
        
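        if (loIndex >= hiIndex
                && (textLength == 0
                    || Character.isWhitespace(text.charAt(loIndex)))) {
            // The input text is empty or blank. The extra check keeps inputs
            // with a single non-whitespace character, such as "a" or " a ":
            return "";
        }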
        
        boolean scanningSpaceSequence = false;
        
        while (loIndex <= hiIndex) {
            char ch = text.charAt(loIndex);
            
            if (!Character.isWhitespace(ch)) {
                sb.append(ch);
                scanningSpaceSequence = false;
            } else if (!scanningSpaceSequence) {
                    scanningSpaceSequence = true;
                    sb.append(' ');
            }
            
            loIndex++;
        }
        
        return sb.toString();
    }
    
    public static void main(String[] args) {
        System.out.println(spaceCompress("hello  world"));
        System.out.println(spaceCompress("  hello  world"));
        System.out.println(spaceCompress("hello    world   "));
        System.out.println(spaceCompress("    hello \t world   "));
        System.out.println(spaceCompress("  cat  \t \t  dog  \n mouse  "));
    }
}

Output

hello world
hello world
hello world
hello world
cat dog mouse

Critique request

As always, tell me anything that comes to mind.

\$\endgroup\$

1 Answer

6
\$\begingroup\$

Simplicity

It strikes me that the simplest method to do this is to leverage the standard library. You will want to learn how to use regular expressions. Your effort will be rewarded.

public final class TextSpaceCompressor {
    public static String spaceCompress(String text) {
        return text.strip().replaceAll("\\s+", " ");
    }
}
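
If the method is called often, one possible refinement is to compile the pattern once rather than on every call, since String.replaceAll compiles its regex argument each time it is invoked. The sketch below assumes that refinement and keeps the null check from your original. The class name PrecompiledSpaceCompressor and the constant name WHITESPACE_RUN are made up for illustration. Keep in mind that, by default, \s matches only [ \t\n\x0B\f\r], whereas strip() and Character.isWhitespace also recognise some Unicode space characters, so the two approaches can differ on such input.

```java
import java.util.Objects;
import java.util.regex.Pattern;

// Hypothetical class name, chosen for this sketch only.
final class PrecompiledSpaceCompressor {

    // One or more whitespace characters; compiled once and reused.
    private static final Pattern WHITESPACE_RUN = Pattern.compile("\\s+");

    private PrecompiledSpaceCompressor() {
    }

    public static String spaceCompress(String text) {
        Objects.requireNonNull(text, "text must not be null");
        // Trim both ends first, then collapse each inner run to one space.
        return WHITESPACE_RUN.matcher(text.strip()).replaceAll(" ");
    }
}
```

Whether this is measurably faster depends on how the method is used, so it is worth profiling before committing to it.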

Note that this also collapses newlines into spaces, because \s matches line terminators. Your test examples indicate you're okay with that, but if you wanted to preserve line breaks you could use streams to map over the lines, perform the substitution on each line, and then join the results back together with newlines.

import java.util.stream.Collectors;

public final class TextSpaceCompressor {
    public static String spaceCompress(String text) {
        return text
            .lines()
            .map(line -> line.strip().replaceAll("\\s+", " "))
            .collect(Collectors.joining("\n"));
    }
}

You might further wish to remove empty lines. For instance, "hello world \n foo \n \n bar" becoming "hello world\nfoo\nbar".

import java.util.stream.Collectors;

public final class TextSpaceCompressor {
    public static String spaceCompress(String text) {
        return text
            .lines()
            .map(line -> line.strip().replaceAll("\\s+", " "))
            .filter(line -> !line.isEmpty()) // compare by content, not by reference
            .collect(Collectors.joining("\n"));
    }
}
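
To make the difference between the two line-preserving variants concrete, here is a sketch that puts them side by side. The class and method names (LineSpaceCompressor, compressKeepingBlankLines, compressDroppingBlankLines) are hypothetical. The empty-line check uses String.isEmpty(), which looks at the content of the string; comparing strings with == or != tests object identity and should not be relied on for this.

```java
import java.util.stream.Collectors;

// Hypothetical class name, chosen for this sketch only.
final class LineSpaceCompressor {

    private LineSpaceCompressor() {
    }

    // Compresses whitespace within a single line.
    private static String compressLine(String line) {
        return line.strip().replaceAll("\\s+", " ");
    }

    // Keeps one output line per input line, even if it ends up empty.
    public static String compressKeepingBlankLines(String text) {
        return text.lines()
                   .map(LineSpaceCompressor::compressLine)
                   .collect(Collectors.joining("\n"));
    }

    // Same as above, but lines that are empty after compression are dropped.
    public static String compressDroppingBlankLines(String text) {
        return text.lines()
                   .map(LineSpaceCompressor::compressLine)
                   .filter(line -> !line.isEmpty())
                   .collect(Collectors.joining("\n"));
    }
}
```

For the input "hello world \n foo \n \n bar", the first method returns "hello world\nfoo\n\nbar" and the second returns "hello world\nfoo\nbar". Since lines() does not produce an empty final line for a trailing line terminator, neither variant preserves a newline at the very end of the input.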

Comments on your code

I note this loop:

        while (loIndex <= hiIndex) {
            char ch = text.charAt(loIndex);
            
            if (!Character.isWhitespace(ch)) {
                sb.append(ch);
                scanningSpaceSequence = false;
            } else if (!scanningSpaceSequence) {
                    scanningSpaceSequence = true;
                    sb.append(' ');
            }
            
            loIndex++;
        }

  • The indentation is inconsistent: the body of the else-if branch is indented one level deeper than the body of the if branch.
  • Both branches set scanningSpaceSequence, but one does so after appending to the StringBuilder and the other before. The order doesn't matter, so it feels odd that this is not consistent.
  • Since the increment of loIndex is unconditional, this is better suited to a for loop:

        for (; loIndex <= hiIndex; loIndex++) {
            char ch = text.charAt(loIndex);
            
            if (!Character.isWhitespace(ch)) {
                sb.append(ch);
                scanningSpaceSequence = false;
            } else if (!scanningSpaceSequence) {
                sb.append(' ');
                scanningSpaceSequence = true;
            }
        }
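
Going one step further, the separate prefix and suffix scans could be folded into this loop by deferring the space: remember that a whitespace run was seen, and emit a single space only when the next non-whitespace character arrives and something has already been written. This is an alternative sketch, not your original method; the class name SinglePassSpaceCompressor and the flag name pendingSpace are mine. With no index arithmetic at the boundaries, edge cases such as empty or one-character inputs need no special handling.

```java
import java.util.Objects;

// Hypothetical class name, chosen for this sketch only.
final class SinglePassSpaceCompressor {

    private SinglePassSpaceCompressor() {
    }

    public static String spaceCompress(String text) {
        Objects.requireNonNull(text);
        StringBuilder sb = new StringBuilder(text.length());
        boolean pendingSpace = false;

        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);

            if (Character.isWhitespace(ch)) {
                // Remember the run, but do not emit anything yet.
                pendingSpace = true;
            } else {
                // Emit one space only between words, never at the start.
                if (pendingSpace && sb.length() > 0) {
                    sb.append(' ');
                }

                sb.append(ch);
                pendingSpace = false;
            }
        }

        // A trailing whitespace run is never followed by a word, so it is dropped.
        return sb.toString();
    }
}
```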

In your main method, it would be a good idea to create an array of strings, and then iterate over them. This will greatly facilitate adding test cases.

    public static void main(String[] args) {
        String[] tests = {
            "hello  world",
            "  hello  world",
            "hello    world   ",
            "    hello \t world   ",
            "  cat  \t \t  dog  \n mouse  "
        };

        for (String test : tests) {
            System.out.println(spaceCompress(test));
        }
    }
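
A possible next step, short of pulling in a test framework, is to pair each input with its expected output so the program checks itself instead of relying on you to eyeball the console. In this sketch the class name SpaceCompressChecks and the method countFailures are hypothetical, and spaceCompress is a stand-in that uses the regex one-liner so the example is self-contained; in practice you would call your own method. I have also added a few boundary inputs (a single character, a blank string, an empty string), which are easy to overlook.

```java
// Hypothetical class name, chosen for this sketch only.
final class SpaceCompressChecks {

    // Each row is { input, expected output }.
    private static final String[][] CASES = {
        { "hello  world",                  "hello world"   },
        { "  hello  world",                "hello world"   },
        { "hello    world   ",             "hello world"   },
        { "    hello \t world   ",         "hello world"   },
        { "  cat  \t \t  dog  \n mouse  ", "cat dog mouse" },
        { "a",                             "a"             },
        { " a ",                           "a"             },
        { "   ",                           ""              },
        { "",                              ""              }
    };

    private SpaceCompressChecks() {
    }

    // Stand-in implementation; substitute the method under test here.
    public static String spaceCompress(String text) {
        return text.strip().replaceAll("\\s+", " ");
    }

    // Runs every case, reports mismatches, and returns how many failed.
    public static int countFailures() {
        int failures = 0;

        for (String[] testCase : CASES) {
            String actual = spaceCompress(testCase[0]);

            if (!actual.equals(testCase[1])) {
                failures++;
                System.out.printf("FAIL: \"%s\" -> \"%s\", expected \"%s\"%n",
                                  testCase[0], actual, testCase[1]);
            }
        }

        return failures;
    }

    public static void main(String[] args) {
        System.out.println(countFailures() + " failure(s)");
    }
}
```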
\$\endgroup\$
  • Very nice indeed! (Commented yesterday)
  • Even more funky, thanks! (Commented yesterday)
  • The regex way is probably still faster in this case (because of many .charAt calls in the OP), but sometimes rewriting regex-based processing with simple string manipulation routines can pay off. Yes, it's longer and more error-prone, but it may help squeeze a few more cycles out of a hot loop. (Commented 22 hours ago)
