The Wayback Machine - https://web.archive.org/web/20201111073056/https://github.com/tensorflow/hub/issues/344

Where should I get the detailed information of the embeddings of the USE model #344

Closed
yoheikikuta opened this issue Aug 3, 2019 · 2 comments

@yoheikikuta yoheikikuta commented Aug 3, 2019

Hi.

I want to understand the embeddings of the USE (Universal Sentence Encoder) model in detail; where can I find this information?

For example, ELMo's embeddings are described at https://tfhub.dev/google/elmo/2.
In the case of USE, however, https://tfhub.dev/google/universal-sentence-encoder/2 only says that the output is a 512-dimensional vector.
How is that output computed?

I found that the output corresponds to the tensor <tf.Tensor 'module_apply_default/Encoder_en/hidden_layers/l2_normalize:0' shape=(?, 512) dtype=float32> in the model's graph, but it is not easy to identify exactly what value is computed.

I read the USE paper, so I can guess the output is something like Σ_w Embed(w) / √(sentence length). But I'm not sure which layer is used as the embeddings: the last layer of the Transformer encoder? The first embedding-lookup layer? Something else?
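The guessed formula above can be sketched in NumPy (a toy illustration of the paper's formula as I read it, not the module's actual code):

```python
import numpy as np

def guessed_sentence_embedding(token_embeddings):
    """Hypothetical reading of the USE paper's formula:
    sum the token embeddings and divide by sqrt(sentence length)."""
    token_embeddings = np.asarray(token_embeddings, dtype=np.float32)
    n_tokens = token_embeddings.shape[0]
    return token_embeddings.sum(axis=0) / np.sqrt(n_tokens)

# Two 4-dimensional token embeddings as toy input.
tokens = [[1.0, 0.0, 2.0, 0.0],
          [1.0, 2.0, 0.0, 0.0]]
print(guessed_sentence_embedding(tokens))  # elementwise sum divided by sqrt(2)
```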

Thanks.

@yoheikikuta yoheikikuta (Author) commented Aug 12, 2019

I misunderstood the model and wrote incorrect information in the description above.
There are several different USE models on TensorFlow Hub.

To clarify the differences, I inspected the graph visualizations of two USE models (universal-sentence-encoder-large/3 and universal-sentence-encoder/2) in TensorBoard.

https://tfhub.dev/google/universal-sentence-encoder-large/3

The model's default output is <tf.Tensor 'module_apply_default/Encoder_en/hidden_layers/l2_normalize:0' shape=(2, 512) dtype=float32>.
This can be confirmed in the graph visualization.

https://tfhub.dev/google/universal-sentence-encoder/2

The model's default output is <tf.Tensor 'module_apply_default/Encoder_en/hidden_layers/l2_normalize:0' shape=(?, 512) dtype=float32>.
This model is based on a Deep Averaging Network (DAN), so the DAN output is fed into the hidden_layers ops, as the graph visualization shows.

Although I now mostly understand the models, it's difficult to trace the detailed computation in the graphs (e.g., what is the concrete expression for the kernel defined in tanh_layer_0 in universal-sentence-encoder-large/3?).
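For what it's worth, a "tanh_layer" node in such graphs is typically a fully connected layer with a tanh activation; this is an assumption about the graph structure, not confirmed from source code. A minimal NumPy sketch of that assumed form:

```python
import numpy as np

def tanh_layer(x, kernel, bias):
    # Assumed form of a "tanh_layer": a dense layer with tanh activation,
    # y = tanh(x @ kernel + bias). The actual kernel values are stored as
    # variables inside the module, not in any released source code.
    return np.tanh(x @ kernel + bias)

# With an identity kernel and zero bias, the layer reduces to tanh(x).
x = np.array([[0.5, -0.5]])
print(tanh_layer(x, np.eye(2), np.zeros(2)))
```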

Is the TensorFlow code that defines these models available, so that we can understand them in detail?

@andresusanopinto andresusanopinto (Collaborator) commented Jul 27, 2020

The code that defines the models is not available; only the modules on TensorFlow Hub are.

"https://tfhub.dev/google/universal-sentence-encoder/2" first computes Σ_w Embed(w) / √(sentence length) and feeds it into several DNN layers; the output of the last DNN layer is used as the output embedding.
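The DAN pipeline described above can be sketched as follows (a toy NumPy illustration with assumed layer shapes and tanh activations; the real module's weights and exact architecture are not public):

```python
import numpy as np

rng = np.random.default_rng(0)

def dan_embedding(token_embeddings, kernels, biases):
    """Sketch of the described DAN variant: average the token embeddings
    scaled by sqrt(sentence length), pass the result through a stack of
    dense tanh layers, and L2-normalize the last layer's output
    (matching the module's l2_normalize:0 output tensor)."""
    x = np.sum(token_embeddings, axis=0) / np.sqrt(len(token_embeddings))
    for W, b in zip(kernels, biases):
        x = np.tanh(x @ W + b)
    return x / np.linalg.norm(x)

# Toy input: 5 tokens with embedding dim 8, two hypothetical hidden layers.
tokens = rng.normal(size=(5, 8)).astype(np.float32)
kernels = [rng.normal(size=(8, 8)), rng.normal(size=(8, 4))]
biases = [np.zeros(8), np.zeros(4)]
emb = dan_embedding(tokens, kernels, biases)
print(emb.shape)  # (4,)
```

The final L2 normalization is why the module's output tensor is named `l2_normalize:0`.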

"https://tfhub.dev/google/universal-sentence-encoder-large/3" has a Transformer encoder and uses average pooling over all token embeddings at the last Transformer layer as the output embedding.
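The pooling step for the large model can be sketched like this (a NumPy toy, with the L2 normalization added on the assumption that it matches the module's `l2_normalize:0` output; shapes are illustrative):

```python
import numpy as np

def average_pool(last_layer_states):
    """Average pooling over token positions at the last Transformer
    layer, followed by L2 normalization (assumed to match the module's
    l2_normalize:0 output tensor)."""
    pooled = np.asarray(last_layer_states).mean(axis=0)
    return pooled / np.linalg.norm(pooled)

# Toy input: 3 tokens with embedding dim 4.
states = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 0.0]])
print(average_pool(states))
```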
