Word to string array in special cases

Hi everyone, I am wondering how to achieve the following:

string mainWord = "LJERKA";

private void BuildStringArray(string mainWord)
{
     string[] mainWordArray = mainWord.ToCharArray().Select(c => c.ToString()).ToArray();
}

When I run this method I get the expected result, an array of single-character strings:
[0] L
[1] J
[2] E
[3] R
[4] K
[5] A

But I want to achieve this:
[0] LJ
[1] E
[2] R
[3] K
[4] A

I need to combine certain character pairs into one string. For example,
LJ, NJ and DŽ should each be a single string, not two separate items in the array.

Do you have any ideas how to achieve this?

Regards

You can treat it a bit like a finite state machine and keep track of only as much state as you need, for example whether the last character was L or something else.

enum State {L, Other}

State state = State.Other;
List<string> result = new List<string>();
for (int i = 0; i < mainWord.Length; i++)
{
    char ch = mainWord[i];
    if (state == State.L)
    {
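        if (ch == 'J')
        {
            // "LJ" is complete and J is consumed; skip the single-letter handling below.
            result.Add("LJ");
            state = State.Other;
            continue;
        }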
        else
        {
             result.Add("L");
        }
    }
    state = State.Other;
    if (ch == 'L')
    {
        state = State.L;
    }
    else
    {
        result.Add("" + ch);
    }
}
if (state == State.L)
{
    result.Add("L");
}

Pretty long code, I must admit, but it does the job in a single pass over the input string.
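
The same single-pass idea can also be written with a one-character lookahead instead of an explicit state variable, which makes it easier to add more letter pairs. This is only a rough sketch: the class name DigraphSplitter, the method Split and the Digraphs set are made up for illustration, and it assumes the two-character letters are exactly LJ, NJ and DŽ in any casing.

```csharp
using System;
using System.Collections.Generic;

public static class DigraphSplitter
{
    // Hypothetical set of two-character letters to keep together (assumption: LJ, NJ, DŽ).
    // OrdinalIgnoreCase lets "lj", "Lj" and "LJ" all count as the same pair.
    private static readonly HashSet<string> Digraphs =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "LJ", "NJ", "DŽ" };

    public static List<string> Split(string word)
    {
        var result = new List<string>();
        int i = 0;
        while (i < word.Length)
        {
            // Look one character ahead; if the pair is a known digraph, take both characters.
            if (i + 1 < word.Length && Digraphs.Contains(word.Substring(i, 2)))
            {
                result.Add(word.Substring(i, 2));
                i += 2;
            }
            else
            {
                // Otherwise the current character stands on its own.
                result.Add(word[i].ToString());
                i += 1;
            }
        }
        return result;
    }
}
```

Calling DigraphSplitter.Split("LJERKA") should give the five items LJ, E, R, K, A. The original casing of the input is kept in the output.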

I'm not sure this catches all the needed letters in every case, but it seems to work on this one:

using UnityEngine;
using System.Text.RegularExpressions;

public class FindLetters : MonoBehaviour
{

    void Start()
    {
        // Add your special two-character letters here; the trailing "." matches any other single character.
        string pattern = @"(LJ|NJ|DŽ|.)";
        string input = @"LJERKALJNJDŽALVJGNYJUDRŽ";

        MatchCollection results = Regex.Matches(input, pattern, RegexOptions.IgnoreCase);

        for (int i = 0, len = results.Count; i < len; i++)
        {
            Debug.LogFormat("index:{0} letter:{1}", i, results[i].ToString());
        }
    }
}

*credits: https://regex101.com + google : )
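
To get back the string array from the original question rather than log output, the same pattern could be wrapped in a small helper. This is a sketch under assumptions: the class RegexLetterSplitter and its Split method are hypothetical names, and it relies on the alternation trying LJ, NJ and DŽ before falling back to the single-character ".".

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexLetterSplitter
{
    // Two-character letters are listed first so they win over the "." fallback.
    private static readonly Regex LetterPattern =
        new Regex("LJ|NJ|DŽ|.", RegexOptions.IgnoreCase);

    public static string[] Split(string word)
    {
        // Cast<Match>() keeps this working where MatchCollection is not generic.
        return LetterPattern.Matches(word)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }
}
```

For example, RegexLetterSplitter.Split("LJERKA") should return LJ, E, R, K, A as five array items. Note that "." does not match a newline by default, so this assumes single-line words.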

Yes, ok, a regular expression is a better way to go.

Hi, thank you for sharing your idea. I took it and improved it a bit, and it works like a charm:

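// Assumes the State enum has been extended: enum State { L, N, D, Other }
// Only lowercase letters are matched, so uppercase input such as "LJERKA" must be lowercased first.
private List<string> WordToCroatianStringArray(string mainWord)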
    {
        State state = State.Other;
        List<string> result = new List<string>();

        for (int i = 0; i < mainWord.Length; i++)
        {
            bool doubleCharAdded = false;
            char ch = mainWord[i];
            if (state == State.L)
            {
                if (ch == 'j')
                {
                    result.Add("lj");
                    doubleCharAdded = true;
                }
                else
                {
                    result.Add("l");
                }
            }
            else if(state == State.N)
            {
                if (ch == 'j')
                {
                    result.Add("nj");
                    doubleCharAdded = true;
                }
                else
                {
                    result.Add("n");
                }
            }
            else if (state == State.D)
            {
                if (ch == 'ž')
                {
                    result.Add("dž");
                    doubleCharAdded = true;
                }
                else
                {
                    result.Add("d");
                }
            }

            state = State.Other;

            if (ch == 'l')
            {
                state = State.L;
            }
            else if (ch == 'n')
            {
                state = State.N;
            }
            else if (ch == 'd')
            {
                state = State.D;
            }
            else
            {
                if(!doubleCharAdded)
                    result.Add("" + ch);
            }
        }

        // add last letter if it is one of the specials
        if (state == State.L)
        {
            result.Add("l");
        }
        if (state == State.N)
        {
            result.Add("n");
        }
        if (state == State.D)
        {
            result.Add("d");
        }

        return result;
    }

Tested with this:

List<string> rest = WordToCroatianStringArray("ljekarna");
foreach (string r in rest)
{
    Debug.Log(r);
}
Debug.Log("--------------------");
List<string> rest2 = WordToCroatianStringArray("poboljšanje");
foreach (string r in rest2)
{
    Debug.Log(r);
}
Debug.Log("--------------------");
List<string> rest3 = WordToCroatianStringArray("džentlmen");
foreach (string r in rest3)
{
    Debug.Log(r);
}
Debug.Log("--------------------");
List<string> rest4 = WordToCroatianStringArray("jednadžba");
foreach (string r in rest4)
{
    Debug.Log(r);
}

I will try it with regex now :)

Regex works like a charm too :)