Support JSON serialization of all kinds of HuggingFace pipeline inputs/outputs #692
adriangonz merged 3 commits into SeldonIO:master
Conversation
adriangonz
left a comment
Hey @pepesi ,
Nice spot! Thanks a lot for contributing this one.
Changes look good to me! 👍
Before merging though, would you be able to provide a test case that covers the issue that this PR fixes?
I fixed the lint error and added some tests for the NumpyEncoder, but I can't run the tests successfully in my local environment because of an error.
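For context, a JSON encoder along these lines (a minimal sketch, not necessarily the exact implementation in this PR) can handle the NumPy scalars and arrays that `json.dumps` rejects by default:

```python
import json

import numpy as np


class NumpyEncoder(json.JSONEncoder):
    """Fallback encoder for NumPy types that json.dumps cannot serialize."""

    def default(self, obj):
        if isinstance(obj, np.integer):
            # NumPy integer scalars (e.g. np.int64) -> plain int
            return int(obj)
        if isinstance(obj, np.floating):
            # NumPy float scalars (e.g. np.float32) -> plain float
            return float(obj)
        if isinstance(obj, np.ndarray):
            # Arrays -> nested Python lists
            return obj.tolist()
        return super().default(obj)


print(json.dumps({"score": np.float32(0.5)}, cls=NumpyEncoder))
```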
Tests passed on my desktop.
After testing more HuggingFace models, I think the NumpyEncoder should be renamed to …
A small test script passed locally, so it seems OK now, but I won't add it to the test suite because I think it's too heavy.
adriangonz
left a comment
Hey @pepesi ,
Massive thanks for the effort you're putting in to make the encoder for HuggingFace models fully complete! It's certainly not an easy task! 🚀 💪
WIP
What changes in this PR

I'm using Seldon and MLServer in a project aimed at quickly deploying and trying out any HuggingFace model. When I tested some vision-related models, I got JSON serialization errors like #656. At first I just wanted to fix the JSON serialization error, but as I got further in, I ran into more trouble with the inputs. The changes: …

May not be compatible

A remedy here: change the is_single default value to None, then change NumpyCodec's default behavior so that if is_single is None, it is set to True; but that may be confusing. @adriangonz, any advice here?
Hey @pepesi , thanks a lot for the time you've put into this PR - I know it hasn't been an easy journey! It's grown quite massively, so it will take us a bit of time to review it, particularly considering that there are breaking changes to some of the existing functionality. We will do our best though!
Dockerfile (outdated)

```diff
 apt-get -y --no-install-recommends install \
     unattended-upgrades \
-    libgomp1 libgl1-mesa-dev libglib2.0-0 build-essential && \
+    libgomp1 libgl1-mesa-dev libglib2.0-0 build-essential ffmpeg && \
```
We have now moved to a different base image for Docker, but it should already install ffmpeg (although in a different way). Therefore, feel free to pick up master's version of the Dockerfile when rebasing 👍
mlserver/batch_processing.py (outdated)

```python
if request_input.parameters is not None:
    new_input._parameters.update(request_input.parameters.dict())
```
Wondering about this change and the exact motivation behind it. It is true that there is currently no option to set parameters on the inputs and on the request as a whole, but I am worried that modifying Triton's internal parameters (as indicated by the leading _) could lead to unexpected breaking changes in the future.
adriangonz
left a comment
Hey @pepesi ,
I've finally found the time to make a first pass at this one.
Before anything else, I have to say this PR is really impressive! The attention to detail shown, and the thoroughness, is incredible! Massive thanks for spending the time on these changes! 🚀
I've added a couple comments below. Would be great to hear your thoughts on those ones. Besides that, I'm not 100% convinced on the introduction of the is_single parameter and wanted to hear your reasoning behind why it is required.
My view is that codecs should always operate with lists of things (i.e. batches), and then it is up to the runtime to deal with these. This simplifies other features like batching, and also simplifies the code, which doesn't need to deal with multiple types (i.e. type T or list of type T).
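The batch-first view described here can be illustrated with a toy codec (hypothetical names, not MLServer's actual interface): the codec always encodes and decodes a list of items, and unwrapping a single element is left to the runtime.

```python
from typing import List


class ToyStringCodec:
    """Toy codec: always operates on batches (lists), never single items."""

    @staticmethod
    def encode(items: List[str]) -> List[bytes]:
        return [item.encode("utf-8") for item in items]

    @staticmethod
    def decode(payload: List[bytes]) -> List[str]:
        return [raw.decode("utf-8") for raw in payload]


# The runtime, not the codec, decides whether a batch of one
# should be unwrapped before being handed to the model:
batch = ToyStringCodec.decode([b"hello"])
single = batch[0] if len(batch) == 1 else batch
```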
```python
if JSONCodec.can_encode(data):
    return JSONCodec
if ImageCodec.can_encode(data):
    return ImageCodec
if ConversationCodec.can_encode(data):
    return ConversationCodec
return find_input_codec_by_payload(data)
```
If we decorate them with @register_input_codec or @register_request_codec we should then be able to find them with find_input_codec(payload=data) (as in, they'll go into the general codec registry, which is not a bad thing).
Some data may be encodable by multiple codecs, but there is no priority between codecs; mlserver.codecs._find_codec_by_payload just returns the first codec that matches. I want to look codecs up in priority order, and that's the reason.
If my understanding of the encoder registry is wrong, please let me know.
Oh, I see. Ok, that makes sense. In that case, it may be worth dropping a comment on that method to briefly explain that reasoning (i.e. to avoid people changing it in the future).
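A priority-ordered lookup like the one discussed here can be sketched as follows (toy codec names; the real codecs live in this PR):

```python
# Codecs are tried in a fixed priority order because a payload may be
# encodable by more than one codec, and the lookup returns the first match.
# NOTE: hypothetical sketch; names only mirror the discussion above.


class JSONish:
    @staticmethod
    def can_encode(data) -> bool:
        return isinstance(data, dict)


class Textish:
    @staticmethod
    def can_encode(data) -> bool:
        return isinstance(data, str)


# Order matters: more specific codecs must come first.
PRIORITISED_CODECS = [JSONish, Textish]


def find_codec(data):
    for codec in PRIORITISED_CODECS:
        if codec.can_encode(data):
            return codec
    return None
```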
```python
@register_input_codec
class ImageCodec(InputCodecTy):
    ContentType = CONTENT_TYPE_IMAGE
```
Is there any reason why we need separate constants for these values? As in, instead of just keeping the actual value here and referring to ImageCodec.ContentType everywhere else?
```python
@register_input_codec
class ImageCodec(InputCodecTy):
```
I was thinking that, to help future maintainers, we should split these codecs into their own files. As in, instead of keeping them all under mlserver_huggingface/codecs.py, we should create a new mlserver_huggingface/codecs package (i.e. folder), and then have:

- mlserver_huggingface/codecs/image.py
- mlserver_huggingface/codecs/conversation.py
- mlserver_huggingface/codecs/json.py
- ...

Each of these files could also hold any codec-specific helpers (like get_img_bytes, which could live in mlserver_huggingface/codecs/image.py).
This will also help with the code review 👍
Your suggestion is right. I will revise it accordingly.
mlserver/batch_processing.py (outdated)

```python
if request_input.parameters is not None:
    new_input._parameters.update(request_input.parameters.dict())
```
Good catch!
We initially considered leveraging the _parameters field, but then decided not to, in order to avoid relying on tritonclient's internal interface. Instead, the general advice when using mlserver infer is to just configure the content_type through the model's metadata (i.e. the model-settings.json file).
Would be great to hear your thoughts behind this change though.
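For reference, configuring the content type through the model's metadata might look roughly like the following model-settings.json fragment (field values are illustrative; check MLServer's docs for the exact schema):

```json
{
  "name": "my-hf-model",
  "inputs": [
    {
      "name": "args",
      "datatype": "BYTES",
      "shape": [-1],
      "parameters": {
        "content_type": "str"
      }
    }
  ]
}
```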
Firstly, I agree with the view that codecs should always operate with lists of things (i.e. batches), and that it is then up to the runtime to deal with these. Before this, I didn't realize that it was a principle to follow, so I think I will remove is_single.

Then, let me explain why I modified InferInput._parameters: after adding the is_single parameter, the Triton HTTP client request needed to pass it to the server as well, which is why I modified InferInput._parameters. As @RafalSkolasinski said, it may cause breaking changes.
Got it! Thanks for that explanation @pepesi . In that case, probably best is to handle the … Looking forward to the next batch of changes! Keep up the great work! We are getting closer. 🚀
This looks great @pepesi ! Thanks a lot for making those changes.
I know this has been a long journey - what started as a fix for numpy, is now a full-blown update to the HF runtime - but it's such a great addition! Massive thanks for all the effort that has gone behind this contribution. 🚀
I think the changes look great. I've added a minor question, but I don't think it's necessarily a blocker - it would be great to have your thoughts on that one. Besides that, once tests are green, this should be good to go ahead! 👍
Thanks a lot for making those changes @pepesi . Once again, massive thanks for all the effort you've put into this one. The PR looks great and tests are green, so this should be good to go! 🚀
Linked issue: huggingface_runtime output JSON serializer does not support NumPy basic datatypes when the data is a dict value