default task detection based on model config#4317
Conversation
| // Check if task-specific parameters are provided or if graph.pbtxt is missing | ||
| bool hasUnmatchedOptions = ::ovms::hasTaskSpecificParameters(result->unmatched()); | ||
| bool graphExists = ::ovms::graphPbtxtExists(*modelPath); | ||
|
|
| const std::optional<std::string> modelPath = result->count("model_path") ? std::make_optional(result->operator[]("model_path").as<std::string>()) : std::nullopt; | ||
| const std::optional<std::string> sourceModel = result->count("source_model") ? std::make_optional(result->operator[]("source_model").as<std::string>()) : std::nullopt; | ||
| const std::optional<std::string> modelRepositoryPath = result->count("model_repository_path") ? std::make_optional(result->operator[]("model_repository_path").as<std::string>()) : std::nullopt; | ||
|
|
| bool graphExists = ::ovms::graphPbtxtExists(*modelPath); | ||
| shouldInferTask = hasUnmatchedOptions || !graphExists; | ||
| } | ||
|
|
| @@ -0,0 +1,6 @@ | |||
| { | |||
| "architectures": ["XLMRobertaForSequenceClassification"], | |||
There was a problem hiding this comment.
I see its possible to have more than 1 architecture (its a list). How would OVMS behave? Can you add test for that behavior?
There was a problem hiding this comment.
In theory this is possible but rarely if ever happens. Added tests to cover it. First architecture will be used.
| @@ -0,0 +1,5 @@ | |||
| { | |||
| "architectures": ["UNet2DConditionModel"], | |||
There was a problem hiding this comment.
Please add these files to data field in unit test build target. If you dont do that, the unit tests will not re-run once this file change.
Example: https://github.com/openvinotoolkit/model_server/blob/main/src/BUILD#L2337-L2364
| const std::string modelPath = resolveTestModelPath("llama"); | ||
| const std::filesystem::path configJson = std::filesystem::path(modelPath) / "config.json"; | ||
| if (!std::filesystem::exists(configJson)) { | ||
| GTEST_SKIP() << "Test prerequisite missing: " << configJson.string(); |
There was a problem hiding this comment.
Why skipping? Shouldnt it error?
| cc_binary( | ||
| name = "num_streams_repro", | ||
| srcs = [ | ||
| "num_streams_repro.cpp", |
| {"XLMRobertaModel", "embeddings"}, | ||
| }; | ||
|
|
||
| std::string getEnvOrDefault(const char* envName, const std::string& defaultValue = "") { |
There was a problem hiding this comment.
these functions should be static
There was a problem hiding this comment.
why? They are already inside a namespace block?
| } | ||
| } | ||
| if (!resolvedTask.has_value()) { | ||
| throw std::logic_error("config.json architectures do not map to a supported default task"); |
There was a problem hiding this comment.
please add test for that
|
|
||
| result = std::make_unique<cxxopts::ParseResult>(options->parse(argc, argv)); | ||
|
|
||
| const bool isConfigManagementFlow = |
There was a problem hiding this comment.
could be a CLIParser method
There was a problem hiding this comment.
Pull request overview
This PR adds automatic inference of the default --task value for generative flows by inspecting HuggingFace-style config.json / model_index.json, reducing the need to pass --task explicitly when starting from --model_path or --source_model.
Changes:
- Add task inference logic in
CLIParserbased on modelarchitectures/ Diffusers pipeline_class_name, including support for local and remote (HF endpoint) config retrieval. - Introduce test fixtures (model config JSONs) and new unit tests validating task inference and config parsing behavior.
- Centralize HF-related env var names in a shared header and reduce LLM calculator log verbosity (DEBUG → TRACE).
Reviewed changes
Copilot reviewed 40 out of 40 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/cli_parser.hpp | Adds inferred task state and declares task inference helper(s). |
| src/cli_parser.cpp | Implements task inference from config/model_index and integrates it into CLI parse flow. |
| src/pull_module/hf_pull_model_module.cpp | Switches HF env var usage to named constants. |
| src/pull_module/hf_env_vars.hpp | New header providing HF env var and default endpoint constants. |
| src/pull_module/BUILD | Adds a Bazel target for the new hf_env_vars header and wires it into deps. |
| src/llm/http_llm_calculator.cc | Lowers Open/Close logging from DEBUG to TRACE. |
| src/test/task_determine_test.cpp | New parameterized unit test for determineDefaultTaskParameter() across model configs. |
| src/test/ovmsconfig_test.cpp | Updates/extends config parsing death + positive tests for inferred task behavior. |
| src/test/models_config_json/xlm_roberta/config.json | Test model HF config fixture for rerank detection. |
| src/test/models_config_json/whisper/config.json | Test model HF config fixture for speech2text detection. |
| src/test/models_config_json/trinity/config.json | Test model HF config fixture for text_generation detection. |
| src/test/models_config_json/t5_encoder/config.json | Test model HF config fixture for embeddings detection. |
| src/test/models_config_json/stable_diffusion/config.json | Test model HF config fixture for image_generation detection. |
| src/test/models_config_json/speecht5/config.json | Test model HF config fixture for text2speech detection. |
| src/test/models_config_json/seamlessm4t/config.json | Test model HF config fixture for speech2text detection. |
| src/test/models_config_json/sdxl/model_index.json | Test Diffusers model_index.json fixture for image_generation detection. |
| src/test/models_config_json/qwen3/config.json | Test model HF config fixture for text_generation detection. |
| src/test/models_config_json/Qwen3-Reranker-0.6B/config.json | Test model HF config fixture for “questionable architecture” rerank disambiguation. |
| src/test/models_config_json/Qwen3-Embedding-0.6B/config.json | Test model HF config fixture for “questionable architecture” embeddings disambiguation. |
| src/test/models_config_json/Qwen3-8B/config.json | Test model HF config fixture for “questionable architecture” ambiguity handling. |
| src/test/models_config_json/qwen3_multi_arch/config.json | Test model HF config fixture for multi-architecture task resolution. |
| src/test/models_config_json/qwen3_asr/config.json | Test model HF config fixture for speech2text detection. |
| src/test/models_config_json/qwen3_6/config.json | Test model HF config fixture for heuristic text_generation detection. |
| src/test/models_config_json/qwen2_rerank/config.json | Test model HF config fixture for rerank detection. |
| src/test/models_config_json/qwen2_embedding/config.json | Test model HF config fixture for embeddings detection. |
| src/test/models_config_json/phi3/config.json | Test model HF config fixture for text_generation detection. |
| src/test/models_config_json/parlertts/config.json | Test model HF config fixture for text2speech detection. |
| src/test/models_config_json/NullArch/config.json | Test model HF config fixture for null-architectures negative path. |
| src/test/models_config_json/no_architectures/config.json | Test model HF config fixture for missing-architectures negative path. |
| src/test/models_config_json/llama/config.json | Test model HF config fixture for text_generation detection. |
| src/test/models_config_json/lfm/config.json | Test model HF config fixture for text_generation detection. |
| src/test/models_config_json/Kokoro/config.json | Test model HF config fixture for null-architectures → text2speech special-case. |
| src/test/models_config_json/invalid_architecture/config.json | Test model HF config fixture for unsupported-architecture negative path. |
| src/test/models_config_json/gemma4/config.json | Test model HF config fixture for text_generation detection. |
| src/test/models_config_json/flux/config.json | Test model HF config fixture for image_generation detection. |
| src/test/models_config_json/flux_pipeline/model_index.json | Test Diffusers model_index.json fixture for image_generation detection. |
| src/test/models_config_json/cross_encoder/config.json | Test model HF config fixture for rerank detection. |
| src/test/models_config_json/bge/config.json | Test model HF config fixture for embeddings detection. |
| src/test/models_config_json/bge_reranker/config.json | Test model HF config fixture for rerank detection. |
| src/BUILD | Adds RapidJSON + pull-module deps to cli_parser target; adds new test and test data glob. |
| #include <fstream> | ||
| #include <filesystem> | ||
| #include <iostream> | ||
| #include <optional> | ||
| #include <stdexcept> | ||
| #include <string> | ||
| #include <map> | ||
| #include <utility> |
| std::string getTaskForQuestionableArchitecture(const std::string& architecture, const std::string& normalizedModelIdentifier) { | ||
| const auto architectureRules = questionableArchitectureTaskKeywords.find(architecture); | ||
| if (architectureRules == questionableArchitectureTaskKeywords.end()) { | ||
| return ""; | ||
| } | ||
| const auto& [defaultTask, patternRules] = architectureRules->second; | ||
| for (const auto& [task, keyword] : patternRules) { | ||
| if (normalizedModelIdentifier.find(keyword) != std::string::npos) { | ||
| return task; | ||
| } | ||
| } | ||
| return defaultTask; | ||
| } |
There was a problem hiding this comment.
currently the only questionable architecture is Qwen3ForCausalLM which is used most frequently for text generation like Qwen3-4B, Qwen3-8B etc. There is no unique pattern specific for text generation. The pattern for rerank and embed should help in proper task identification. For potential other questionable architectures it would be possible to set empty default task.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
🛠 Summary
CVS-188024
task parameter will be determined automatically based on model config.json, model_index.json and in special cases based on model name pattern.
It simplifies model deployment from HF hub.
While using --source_model - it will always create or update graph.pbtxt
While using --model_path - it will using params from graph.pbtxt if exist and no overwrite params are passed. When graph.pbtxt is missing, it will determine task automatically
In case task can't be determined like for unknows architectures - no default task will be set to be provided by the user.
🧪 Checklist
``