1

I am attempting to send a Json request to a model on google cloud's ml-engine. This requires a json in the form

For which I need to convert a float array to a single base64 encoded string.

I thought perhaps the google protobuf ByteString would be what I am looking for, but that seems to behave the same way as the byte array (stackoverflow question on the difference between the two).

My current approach to create the value for the "b64" key creates an array of byte strings, which leads to a google cloud error (see other question).

  public static String[] convertToBase64Bytes(float[] audio) {
    String[] data = new String[audio.length];
    for (int i = 0; i < audio.length; i++) {
      float amplitude = audio[i];
      byte[] byteArray = ByteBuffer.allocate(4).putFloat(amplitude).array();
      data[i] = Base64.encodeToString(byteArray, Base64.DEFAULT);
    }
    return data;
  }

I have been unable to find how I can convert the entire float array to a single base64 byte string, which the ml-engine can then convert back to the original array.

In case its useful, the way I did this is Python was

bytes_string = audio_array.tostring() #audio_array is a numpy array
encoded = base64.b64encode(bytes_string)

Would anyone be able to help with this? Thanks.

1
  • Sorry I mistakendly editted you questino and deleted the json example. I seem not to find a way to undo it. Commented Feb 28, 2018 at 15:54

2 Answers 2

3
public static String convertToBase64Bytes(float[] audio) { 
     ByteBuffer buff = ByteBuffer.allocate(4 * audio.length);
     for (int i = 0; i < audio.length; i++) {
       float amplitude = audio[i]; 
       buff.putFloat(amplitude);
     }
     String data = Base64.getEncoder().encodeToString(buff.array(), Base64.DEFAULT);
     return data; 
   }
Sign up to request clarification or add additional context in comments.

Comments

0

For this solution I am using Gson() (you can the jar or the maven dependencies from here) to generate the final string as in your example.

And I have created a couple of helper classes that you could put somewhere else in the project (not necessarily as inner classes).

The main method is just to provide the means of running the code.

The output looks like this:

EDITED I

Original Solution for one b64 item for each float.

{"instances":[{"b64":"QUczMw=="},{"b64":"QgpmZg=="},{"b64":"wgHS8g=="},{"b64":"QU+uFA=="}]}

The code:

public class FloatEncoder {

    public static void main(String args[]) {
        FloatEncoder encoder = new FloatEncoder();

        float [] floats = new float[] {12.45f, 34.6f, -32.456f, 12.98f};
        String encodedJson = encoder.encode(floats);
        System.out.println(encodedJson);
    }

    private String encode(float[] floats) {
        String rtn;
        DataHolder holder = new DataHolder();


        String [] audios = convertToBase64Bytes(floats);

        for(String audio : audios) {
            B64 b64 = new B64();
            b64.b64 = audio;
            holder.instances.add(b64);
        }

        Gson gson = new GsonBuilder().disableHtmlEscaping().create();
        rtn = gson.toJson(holder);

        return rtn;
    }

      public static String[] convertToBase64Bytes(float[] audio) {
        String[] data = new String[audio.length];
        for (int i = 0; i < audio.length; i++) {
          float amplitude = audio[i];
          byte[] byteArray = ByteBuffer.allocate(4).putFloat(amplitude).array();
          data[i] = Base64.getEncoder().encodeToString(byteArray);
        }
        return data;
      }

      public static class DataHolder{
          public ArrayList<B64> instances = new ArrayList<>();
      }

      public static class B64{
          public String b64;
      }
}

EDIT II

Solution for one b64 item with the array of floats encoded as a single string.

{"instances":[{"b64":"QUczM0IKZmbCAdLyQU+uFA=="}]}

The string is the Base64 encoding of a byte array where the first 4 bytes are the first float, the second 4 are the second float and so forth.

public class FloatEncoder {

    public static void main(String args[]) {
        FloatEncoder encoder = new FloatEncoder();

        float [] floats = new float[] {12.45f, 34.6f, -32.456f, 12.98f};
        String encodedJson = encoder.encode(floats);
        System.out.println(encodedJson);
    }

    private String encode(float[] floats) {
        String rtn;
        DataHolder holder = new DataHolder();


        String audios = convertToBase64Bytes(floats);
        B64 b64 = new B64();
        b64.b64 = audios;
        holder.instances.add(b64);

        Gson gson = new GsonBuilder().disableHtmlEscaping().create();
        rtn = gson.toJson(holder);

        return rtn;
    }

      public static String convertToBase64Bytes(float[] audio) {
        ByteBuffer byteBuffer = ByteBuffer.allocate(4 * audio.length);
        for (int i = 0; i < audio.length; i++) {
          float amplitude = audio[i];  
          byteBuffer.putFloat(amplitude);
        }
        byte[] data = byteBuffer.array();
        String rtn = Base64.getEncoder().encodeToString(data);
        return rtn;
      }

      public static class DataHolder{
          public ArrayList<B64> instances = new ArrayList<>();
      }

      public static class B64{
          public String b64;
      }
}

7 Comments

Thanks! A couple clarifications. 1) I'm aiming to create one byte string per float array, your example gives each float in the array as a separate byte string. I attempted to create a string by concatenating the audio strings in the encode for loop, however it was very slow. Would you have a better suggestion? 2) Do you know why the strings your method makes ("QUczMw\u003d\u003d\nQgpmZg\u003d\u003d\nwgHS8g\u003d\u003d\nQU+uFA\u003d\u003d\n" when concatenated) is different to @user:4956493 answer (QUczM0IKZmbCAdLyQU+uFA==)?
A float in java is 4 bytes. You can't fit 4 bytes in one byte. When you encode base64 the number of bytes grows because of how the encoding works to bring values into the printable characters domain. By default json escapes html so I made a change for it not to so you can see the == symbols.
The concatenation you mention won't work. I thought the format you needed was the one you exposed (and I mistakendly deleted :( ). You need to be clear on how the information must reach destination. If it is you on the other side receiving the json, you can easily undo the encoding.
My bad with the example, I meant to show that multiple audio arrays could be sent, I'll make that clearer. The other side is google cloud, which will automatically decode a base64 string of an array into bytes, so I need to be able to send a single string which represents the entire array of floats. I have managed to do this in python, so I expect it to be possible in java too?
The yosher lutskis answer does send a single byte string which google cloud accepts, but your answer looks more like the form of byte strings I have seen before (i.e. the result of [numpy_array].tostring()), so I was wondering if your version could be made into a single string
|

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.