Support JSON serialization of all kinds of HuggingFace pipeline inputs/outputs #692
adriangonz merged 3 commits into SeldonIO:master
Conversation
adriangonz
left a comment
Hey @pepesi ,
Nice spot! Thanks a lot for contributing this one.
Changes look good to me! 👍
Before merging though, would you be able to provide a test case that covers the issue that this PR fixes?
I fixed the lint error and added some tests for the NumpyEncoder, but I can't run the tests successfully in my local environment because of an error.
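For context, a JSON encoder along these lines (a minimal sketch, not necessarily the exact implementation in this PR) can handle the NumPy scalars and arrays that `json.dumps` rejects by default:

```python
import json

import numpy as np


class NumpyEncoder(json.JSONEncoder):
    """Fallback encoder for NumPy types that json.dumps cannot serialize."""

    def default(self, obj):
        if isinstance(obj, np.integer):
            # NumPy integer scalars (e.g. np.int64) -> plain int
            return int(obj)
        if isinstance(obj, np.floating):
            # NumPy float scalars (e.g. np.float32) -> plain float
            return float(obj)
        if isinstance(obj, np.ndarray):
            # Arrays -> nested Python lists
            return obj.tolist()
        return super().default(obj)


print(json.dumps({"score": np.float32(0.5)}, cls=NumpyEncoder))
```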
Tests passed on my desktop.
After testing more HuggingFace models, I think the NumpyEncoder should be renamed to …
A small test script passed locally, so it seems OK now, but I won't add it to the test suite because I think it's too heavy.
adriangonz
left a comment
Hey @pepesi ,
Massive thanks for the effort you're putting in to make the encoder for HuggingFace models fully complete! It's certainly not an easy task! 🚀 💪
WIP
What changes in this PR

I'm using Seldon and MLServer in a project aimed at quickly deploying and trying out any HuggingFace model. When I tested some vision-related models, I got JSON serialization errors like #656. At first I just wanted to fix the JSON serialization error, but as I got further in, I ran into more trouble with the inputs. The changes: …

May not be compatible

A remedy here: change the is_single default value to None, then change NumpyCodec's default behavior so that if is_single is None, it is set to True; but that may be confusing. @adriangonz, any advice here?
Hey @pepesi , thanks a lot for the time you've put into this PR - I know it hasn't been an easy journey! It's grown quite massively, so it will take us a bit of time to review it, particularly considering that there are breaking changes to some of the existing functionality. We will do our best though!
Dockerfile (outdated)

```diff
 apt-get -y --no-install-recommends install \
     unattended-upgrades \
-    libgomp1 libgl1-mesa-dev libglib2.0-0 build-essential && \
+    libgomp1 libgl1-mesa-dev libglib2.0-0 build-essential ffmpeg && \
```
We have now moved to a different base image for Docker, but it should already install ffmpeg (although in a different way). Therefore, feel free to pick up master's version of the Dockerfile when rebasing 👍
mlserver/batch_processing.py (outdated)

```python
if request_input.parameters is not None:
    new_input._parameters.update(request_input.parameters.dict())
```
Wondering about this change and the exact motivation behind it. It is true that there is currently no option to set parameters on the inputs and on the request as a whole, but I am worried that modifying Triton's internal parameters (as indicated by the leading _) could lead to unexpected breaking changes in the future.
adriangonz
left a comment
Hey @pepesi ,
I've finally found the time to make a first pass at this one.
Before anything else, I have to say this PR is really impressive! The attention to detail shown, and the thoroughness, is incredible! Massive thanks for spending the time on these changes! 🚀
I've added a couple comments below. Would be great to hear your thoughts on those ones. Besides that, I'm not 100% convinced on the introduction of the is_single parameter and wanted to hear your reasoning behind why it is required.
My view is that codecs should always operate with lists of things (i.e. batches), and then it is up to the runtime to deal with these. This simplifies other features like batching, and also simplifies the code, which doesn't need to deal with multiple types (i.e. type T or list of type T).
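The batch-first view described here can be illustrated with a toy codec (hypothetical names, not MLServer's actual interface): the codec always encodes and decodes a list of items, and unwrapping a single element is left to the runtime.

```python
from typing import List


class ToyStringCodec:
    """Toy codec: always operates on batches (lists), never single items."""

    @staticmethod
    def encode(items: List[str]) -> List[bytes]:
        return [item.encode("utf-8") for item in items]

    @staticmethod
    def decode(payload: List[bytes]) -> List[str]:
        return [raw.decode("utf-8") for raw in payload]


# The runtime, not the codec, decides whether a batch of one
# should be unwrapped before being handed to the model:
batch = ToyStringCodec.decode([b"hello"])
single = batch[0] if len(batch) == 1 else batch
```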
```python
if JSONCodec.can_encode(data):
    return JSONCodec
if ImageCodec.can_encode(data):
    return ImageCodec
if ConversationCodec.can_encode(data):
    return ConversationCodec
return find_input_codec_by_payload(data)
```
If we decorate them with @register_input_codec or @register_request_codec we should then be able to find them with find_input_codec(payload=data) (as in, they'll go into the general codec registry, which is not a bad thing).
Some data may be encodable by multiple codecs, but there is no priority between codecs; mlserver.codecs._find_codec_by_payload just returns the first codec that matches. I want to look codecs up in priority order, and that's the reason.
If my understanding of the encoder registry is wrong, please let me know.
Oh, I see. Ok, that makes sense. In that case, it may be worth dropping a comment on that method to briefly explain that reasoning (i.e. to avoid people changing it in the future).
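A priority-ordered lookup like the one discussed here can be sketched as follows (toy codec names; the real codecs live in this PR):

```python
# Codecs are tried in a fixed priority order because a payload may be
# encodable by more than one codec, and the lookup returns the first match.
# NOTE: hypothetical sketch; names only mirror the discussion above.


class JSONish:
    @staticmethod
    def can_encode(data) -> bool:
        return isinstance(data, dict)


class Textish:
    @staticmethod
    def can_encode(data) -> bool:
        return isinstance(data, str)


# Order matters: more specific codecs must come first.
PRIORITISED_CODECS = [JSONish, Textish]


def find_codec(data):
    for codec in PRIORITISED_CODECS:
        if codec.can_encode(data):
            return codec
    return None
```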
```python
@register_input_codec
class ImageCodec(InputCodecTy):
    ContentType = CONTENT_TYPE_IMAGE
```
Is there any reason why we need separate constants for these values? As in, instead of just keeping the actual value here and referring to ImageCodec.ContentType everywhere else?
```python
@register_input_codec
class ImageCodec(InputCodecTy):
```
I was thinking that, to help future maintainers, we should split these codecs into their own files. As in, instead of keeping them all under mlserver_huggingface/codecs.py, we should create a new mlserver_huggingface/codecs package (i.e. folder), and then have:

- mlserver_huggingface/codecs/image.py
- mlserver_huggingface/codecs/conversation.py
- mlserver_huggingface/codecs/json.py
- ...

Each of these files could also hold any codec-specific helpers (like get_img_bytes, which could live in mlserver_huggingface/codecs/image.py).
This will also help with the code review 👍
Your suggestion is right. I will revise it accordingly.
mlserver/batch_processing.py (outdated)

```python
if request_input.parameters is not None:
    new_input._parameters.update(request_input.parameters.dict())
```
Good catch!
We initially considered leveraging the _parameters field, but then decided not to, in order to avoid relying on tritonclient's internal interface. Instead, the general advice when using mlserver infer is to just configure the content_type through the model's metadata (i.e. the model-settings.json file).
Would be great to hear your thoughts behind this change though.
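For reference, configuring the content type through the model's metadata might look roughly like the following model-settings.json fragment (field values are illustrative; check MLServer's docs for the exact schema):

```json
{
  "name": "my-hf-model",
  "inputs": [
    {
      "name": "args",
      "datatype": "BYTES",
      "shape": [-1],
      "parameters": {
        "content_type": "str"
      }
    }
  ]
}
```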
Firstly, I agree with the view that codecs should always operate with lists of things (i.e. batches), and that it is then up to the runtime to deal with these. Before this, I didn't realize that it was a principle to follow, so I think I will remove is_single.

Then, let me explain why I modified InferInput._parameters: after adding the is_single parameter, the Triton HTTP client request needed to pass it to the server as well, which is why I modified InferInput._parameters. As @RafalSkolasinski said, it may cause breaking changes.
Got it! Thanks for that explanation @pepesi . In that case, probably best is to handle the … Looking forward to the next batch of changes! Keep up the great work! We are getting closer. 🚀
This looks great @pepesi ! Thanks a lot for making those changes.
I know this has been a long journey - what started as a fix for numpy, is now a full-blown update to the HF runtime - but it's such a great addition! Massive thanks for all the effort that has gone behind this contribution. 🚀
I think the changes look great. I've added a minor question, but I don't think it's necessarily a blocker - it would be great to have your thoughts on that one. Besides that, once tests are green, this should be good to go ahead! 👍
Thanks a lot for making those changes @pepesi . Once again, massive thanks for all the effort you've put into this one. The PR looks great and tests are green, so this should be good to go! 🚀
Linked issue: huggingface_runtime output JSON serializer does not support NumPy basic datatypes when the data is a dict value