I'm working on an application that allows users to edit and fix XML. Part of this is formatting the XML for better readability.
As the XML might be invalid, the existing methods I found for formatting (like XmlWriter or XDocument) don't work for me.
There might be all sorts of problems with the XML, although the most common is unescaped special characters.
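To illustrate the problem, here is a minimal sketch of how the standard parser reacts to an unescaped ampersand. The class and method names (`InvalidXmlDemo`, `CanParse`) are hypothetical and only serve this example; the only library call used is `XDocument.Parse`.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class InvalidXmlDemo
{
    // Hypothetical helper: returns true if XDocument can parse the input,
    // false if parsing fails with an XmlException.
    public static bool CanParse(string xml)
    {
        try
        {
            XDocument.Parse(xml);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

With this helper, `CanParse("<a>Tom &amp; Jerry</a>")` succeeds, while `CanParse("<a>Tom & Jerry</a>")` fails because the bare `&` is not a valid entity reference, so a parser-based formatter never gets a chance to run.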
public static string FormatXml(string xml)
{
    var tags = xml
        .Split('<')
        .Select(tag => tag.TrimEnd().EndsWith(">") ? tag.TrimEnd() : tag); //Trim whitespace between tags, but not at the end of values
    var previousTag = tags.First(); //Preserve content before the first tag, e.g. if the initial < is missing
    var formattedXml = new StringBuilder(previousTag);
    var indention = 0;
    
    foreach (var tag in tags.Skip(1))
    {
        if (previousTag.EndsWith(">"))
        {
            formattedXml.AppendLine();
            if (tag.StartsWith("/"))
            {
                indention = Math.Max(indention - 1, 0);
                formattedXml.Append(new string('\t', indention));
            }
            else
            {
                formattedXml.Append(new string('\t', indention));
                if (!tag.EndsWith("/>"))
                {
                    indention++;
                }
            }
        }
        else
        {
            indention = Math.Max(indention - 1, 0);
        }
        formattedXml.Append("<");
        formattedXml.Append(tag);
        previousTag = tag;
    }
    return formattedXml.ToString();
}
So far, the method produces reasonable output for all the cases I came up with.
I'm mostly worried that I missed some special cases of valid XML that would get messed up.
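One way I could probe for such cases is a round-trip check on inputs that are known to be valid: parse the original and the formatted text and compare the resulting trees. The sketch below is only an assumption about how such a check might look. `FormatterChecks` and `PreservesStructure` are hypothetical names, and the formatter is passed in as a delegate so the helper does not depend on the method above.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class FormatterChecks
{
    // Hypothetical helper: returns true if formatting valid XML leaves the
    // parsed document tree unchanged. Whitespace-only text between tags is
    // ignored because XDocument.Parse drops it by default.
    public static bool PreservesStructure(string validXml, Func<string, string> format)
    {
        var original = XDocument.Parse(validXml);
        try
        {
            var formatted = XDocument.Parse(format(validXml));
            return XNode.DeepEquals(original, formatted);
        }
        catch (XmlException)
        {
            // The formatter turned valid XML into something unparsable.
            return false;
        }
    }
}
```

Calling `FormatterChecks.PreservesStructure(input, FormatXml)` over a set of valid samples would flag any input whose content changes. Candidates worth trying might include text that ends in a literal `>`, comments, CDATA sections, and processing instructions.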
Is the XML passed to the method before or after the user edits the XML?