default task detection based on model config#4317

Open
dtrawins wants to merge 17 commits into main from default-task

@dtrawins dtrawins commented Jun 22, 2026

Collaborator

🛠 Summary

CVS-188024
The task parameter is determined automatically from the model's config.json or model_index.json and, in special cases, from the model name pattern.
This simplifies model deployment from the HF hub.
With --source_model, graph.pbtxt is always created or updated.
With --model_path, parameters from graph.pbtxt are used if the file exists and no overriding parameters are passed. When graph.pbtxt is missing, the task is determined automatically.
If the task can't be determined, for example for unknown architectures, no default task is set and the user has to provide it.
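
A minimal sketch of the decision described above, not the actual OVMS code. The struct, its field names and the function name are hypothetical, and the rule that an explicit --task disables inference is an assumption rather than something stated in this summary.

```cpp
// Hypothetical summary of the command-line state that matters for the decision.
struct DeploymentInputs {
    bool explicitTask = false;       // assumption: --task was passed by the user
    bool usesSourceModel = false;    // --source_model was passed
    bool usesModelPath = false;      // --model_path was passed
    bool graphExists = false;        // graph.pbtxt already exists in the model directory
    bool hasOverrideParams = false;  // task-specific parameters were passed on the command line
};

// Returns true when the task should be inferred from config.json / model_index.json.
bool shouldInferTask(const DeploymentInputs& in) {
    if (in.explicitTask) {
        return false;  // assumption: an explicit task always wins
    }
    if (in.usesSourceModel) {
        return true;  // graph.pbtxt is always created or updated in this flow
    }
    if (in.usesModelPath) {
        // Reuse graph.pbtxt only when it exists and nothing overrides it.
        return in.hasOverrideParams || !in.graphExists;
    }
    return false;
}
```

With --model_path and an existing graph.pbtxt, the sketch returns false unless override parameters are present, which matches the behavior described in the summary.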

🧪 Checklist

  • Unit tests added.
  • The documentation updated.
  • Change follows security best practices.

Comment thread src/cli_parser.cpp Outdated
// Check if task-specific parameters are provided or if graph.pbtxt is missing
bool hasUnmatchedOptions = ::ovms::hasTaskSpecificParameters(result->unmatched());
bool graphExists = ::ovms::graphPbtxtExists(*modelPath);

Collaborator Author

Suggested change

Comment thread src/cli_parser.cpp Outdated
const std::optional<std::string> modelPath = result->count("model_path") ? std::make_optional(result->operator[]("model_path").as<std::string>()) : std::nullopt;
const std::optional<std::string> sourceModel = result->count("source_model") ? std::make_optional(result->operator[]("source_model").as<std::string>()) : std::nullopt;
const std::optional<std::string> modelRepositoryPath = result->count("model_repository_path") ? std::make_optional(result->operator[]("model_repository_path").as<std::string>()) : std::nullopt;

Collaborator Author

Suggested change

Comment thread src/cli_parser.cpp Outdated
bool graphExists = ::ovms::graphPbtxtExists(*modelPath);
shouldInferTask = hasUnmatchedOptions || !graphExists;
}

Collaborator Author

Suggested change

@dtrawins dtrawins requested review from dkalinowski and mzegla June 23, 2026 10:09
@@ -0,0 +1,6 @@
{
"architectures": ["XLMRobertaForSequenceClassification"],

Collaborator

I see it's possible to have more than one architecture (it's a list). How would OVMS behave? Can you add a test for that behavior?

Collaborator Author

In theory this is possible, but it rarely if ever happens. Added tests to cover it. The first architecture will be used.
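
A rough sketch of how a multi-entry architectures list could be resolved. It assumes "first" means the first entry that has a known mapping. The table is a two-entry excerpt for illustration, and the function name is hypothetical. The sketch returns std::nullopt when nothing maps, whereas the code under review throws std::logic_error in that case.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Illustrative excerpt of an architecture-to-task table (not the full table from the PR).
static const std::map<std::string, std::string> kArchitectureToTask = {
    {"XLMRobertaModel", "embeddings"},
    {"XLMRobertaForSequenceClassification", "rerank"},
};

// Walks the "architectures" list in order and returns the task of the first
// entry that has a mapping, or std::nullopt when none of them is known.
std::optional<std::string> resolveTaskFromArchitectures(const std::vector<std::string>& architectures) {
    for (const auto& architecture : architectures) {
        const auto it = kArchitectureToTask.find(architecture);
        if (it != kArchitectureToTask.end()) {
            return it->second;
        }
    }
    return std::nullopt;
}
```

Under this reading, the order of the list in config.json decides the task whenever several entries are known.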

@@ -0,0 +1,5 @@
{
"architectures": ["UNet2DConditionModel"],

Collaborator

Please add these files to the data field of the unit test build target. If you don't, the unit tests will not re-run when one of these files changes.

Example: https://github.com/openvinotoolkit/model_server/blob/main/src/BUILD#L2337-L2364

Comment thread src/test/ovmsconfig_test.cpp Outdated
const std::string modelPath = resolveTestModelPath("llama");
const std::filesystem::path configJson = std::filesystem::path(modelPath) / "config.json";
if (!std::filesystem::exists(configJson)) {
GTEST_SKIP() << "Test prerequisite missing: " << configJson.string();

Collaborator

Why skip here? Shouldn't it error?

Comment thread src/BUILD Outdated
cc_binary(
name = "num_streams_repro",
srcs = [
"num_streams_repro.cpp",

Collaborator

?

Comment thread src/cli_parser.cpp
{"XLMRobertaModel", "embeddings"},
};

std::string getEnvOrDefault(const char* envName, const std::string& defaultValue = "") {

Collaborator

these functions should be static

Collaborator Author

why? They are already inside a namespace block?

Comment thread src/cli_parser.cpp
}
}
if (!resolvedTask.has_value()) {
throw std::logic_error("config.json architectures do not map to a supported default task");

Collaborator

please add test for that

Comment thread src/cli_parser.cpp

result = std::make_unique<cxxopts::ParseResult>(options->parse(argc, argv));

const bool isConfigManagementFlow =

Collaborator

could be a CLIParser method

@dtrawins dtrawins marked this pull request as ready for review June 24, 2026 14:51
Copilot AI review requested due to automatic review settings June 24, 2026 14:51

Copilot AI left a comment

Contributor

Pull request overview

This PR adds automatic inference of the default --task value for generative flows by inspecting HuggingFace-style config.json / model_index.json, reducing the need to pass --task explicitly when starting from --model_path or --source_model.

Changes:

  • Add task inference logic in CLIParser based on model architectures / Diffusers pipeline _class_name, including support for local and remote (HF endpoint) config retrieval.
  • Introduce test fixtures (model config JSONs) and new unit tests validating task inference and config parsing behavior.
  • Centralize HF-related env var names in a shared header and reduce LLM calculator log verbosity (DEBUG → TRACE).
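
As an illustration of the last point, a shared header for HF-related environment variables might look roughly like the sketch below. The namespace and constant names are hypothetical, and the variable name and default endpoint value are assumptions; the real hf_env_vars.hpp may differ. Only the getEnvOrDefault signature is taken from the snippet under review, and its body here is a guess at a typical implementation.

```cpp
#include <cstdlib>
#include <string>

namespace hf_env_sketch {  // hypothetical namespace, for illustration only

// Assumed names and values; the constants in the PR may be spelled differently.
inline constexpr const char* HF_ENDPOINT_ENV = "HF_ENDPOINT";
inline constexpr const char* DEFAULT_HF_ENDPOINT = "https://huggingface.co";

// Returns the value of the environment variable, or defaultValue when it is not set.
inline std::string getEnvOrDefault(const char* envName, const std::string& defaultValue = "") {
    const char* value = std::getenv(envName);
    return value != nullptr ? std::string(value) : defaultValue;
}

// Resolves the endpoint used for remote config retrieval.
inline std::string resolveHfEndpoint() {
    return getEnvOrDefault(HF_ENDPOINT_ENV, DEFAULT_HF_ENDPOINT);
}

}  // namespace hf_env_sketch
```

Keeping the names in one header means the pull module and the CLI parser read the same variable and cannot drift apart on spelling.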

Reviewed changes

Copilot reviewed 40 out of 40 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/cli_parser.hpp | Adds inferred task state and declares task inference helper(s). |
| src/cli_parser.cpp | Implements task inference from config/model_index and integrates it into CLI parse flow. |
| src/pull_module/hf_pull_model_module.cpp | Switches HF env var usage to named constants. |
| src/pull_module/hf_env_vars.hpp | New header providing HF env var and default endpoint constants. |
| src/pull_module/BUILD | Adds a Bazel target for the new hf_env_vars header and wires it into deps. |
| src/llm/http_llm_calculator.cc | Lowers Open/Close logging from DEBUG to TRACE. |
| src/test/task_determine_test.cpp | New parameterized unit test for determineDefaultTaskParameter() across model configs. |
| src/test/ovmsconfig_test.cpp | Updates/extends config parsing death + positive tests for inferred task behavior. |
| src/test/models_config_json/xlm_roberta/config.json | Test model HF config fixture for rerank detection. |
| src/test/models_config_json/whisper/config.json | Test model HF config fixture for speech2text detection. |
| src/test/models_config_json/trinity/config.json | Test model HF config fixture for text_generation detection. |
| src/test/models_config_json/t5_encoder/config.json | Test model HF config fixture for embeddings detection. |
| src/test/models_config_json/stable_diffusion/config.json | Test model HF config fixture for image_generation detection. |
| src/test/models_config_json/speecht5/config.json | Test model HF config fixture for text2speech detection. |
| src/test/models_config_json/seamlessm4t/config.json | Test model HF config fixture for speech2text detection. |
| src/test/models_config_json/sdxl/model_index.json | Test Diffusers model_index.json fixture for image_generation detection. |
| src/test/models_config_json/qwen3/config.json | Test model HF config fixture for text_generation detection. |
| src/test/models_config_json/Qwen3-Reranker-0.6B/config.json | Test model HF config fixture for “questionable architecture” rerank disambiguation. |
| src/test/models_config_json/Qwen3-Embedding-0.6B/config.json | Test model HF config fixture for “questionable architecture” embeddings disambiguation. |
| src/test/models_config_json/Qwen3-8B/config.json | Test model HF config fixture for “questionable architecture” ambiguity handling. |
| src/test/models_config_json/qwen3_multi_arch/config.json | Test model HF config fixture for multi-architecture task resolution. |
| src/test/models_config_json/qwen3_asr/config.json | Test model HF config fixture for speech2text detection. |
| src/test/models_config_json/qwen3_6/config.json | Test model HF config fixture for heuristic text_generation detection. |
| src/test/models_config_json/qwen2_rerank/config.json | Test model HF config fixture for rerank detection. |
| src/test/models_config_json/qwen2_embedding/config.json | Test model HF config fixture for embeddings detection. |
| src/test/models_config_json/phi3/config.json | Test model HF config fixture for text_generation detection. |
| src/test/models_config_json/parlertts/config.json | Test model HF config fixture for text2speech detection. |
| src/test/models_config_json/NullArch/config.json | Test model HF config fixture for null-architectures negative path. |
| src/test/models_config_json/no_architectures/config.json | Test model HF config fixture for missing-architectures negative path. |
| src/test/models_config_json/llama/config.json | Test model HF config fixture for text_generation detection. |
| src/test/models_config_json/lfm/config.json | Test model HF config fixture for text_generation detection. |
| src/test/models_config_json/Kokoro/config.json | Test model HF config fixture for null-architectures → text2speech special-case. |
| src/test/models_config_json/invalid_architecture/config.json | Test model HF config fixture for unsupported-architecture negative path. |
| src/test/models_config_json/gemma4/config.json | Test model HF config fixture for text_generation detection. |
| src/test/models_config_json/flux/config.json | Test model HF config fixture for image_generation detection. |
| src/test/models_config_json/flux_pipeline/model_index.json | Test Diffusers model_index.json fixture for image_generation detection. |
| src/test/models_config_json/cross_encoder/config.json | Test model HF config fixture for rerank detection. |
| src/test/models_config_json/bge/config.json | Test model HF config fixture for embeddings detection. |
| src/test/models_config_json/bge_reranker/config.json | Test model HF config fixture for rerank detection. |
| src/BUILD | Adds RapidJSON + pull-module deps to cli_parser target; adds new test and test data glob. |

Comment thread src/cli_parser.cpp
Comment on lines +18 to 25
#include <fstream>
#include <filesystem>
#include <iostream>
#include <optional>
#include <stdexcept>
#include <string>
#include <map>
#include <utility>
Comment thread src/cli_parser.cpp
Comment on lines +138 to +150
std::string getTaskForQuestionableArchitecture(const std::string& architecture, const std::string& normalizedModelIdentifier) {
const auto architectureRules = questionableArchitectureTaskKeywords.find(architecture);
if (architectureRules == questionableArchitectureTaskKeywords.end()) {
return "";
}
const auto& [defaultTask, patternRules] = architectureRules->second;
for (const auto& [task, keyword] : patternRules) {
if (normalizedModelIdentifier.find(keyword) != std::string::npos) {
return task;
}
}
return defaultTask;
}

Collaborator Author

Currently the only questionable architecture is Qwen3ForCausalLM, which is most often used for text generation (Qwen3-4B, Qwen3-8B, etc.). There is no name pattern unique to text generation; the rerank and embed patterns should help identify those two tasks correctly. For any other questionable architectures added later, the default task could be left empty.
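
To make the rule concrete, here is a self-contained sketch built around the function from the hunk above. The table contents are an assumption reconstructed from this comment (default text_generation, keywords "rerank" and "embed"); the actual entries in cli_parser.cpp may differ. The normalizeModelIdentifier helper is hypothetical and simply lowercases the name.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Each rule is a {task, keyword} pair; the first keyword found in the model name wins.
using PatternRules = std::vector<std::pair<std::string, std::string>>;

// Assumed table contents: architecture -> {default task, pattern rules}.
static const std::map<std::string, std::pair<std::string, PatternRules>> questionableArchitectureTaskKeywords = {
    {"Qwen3ForCausalLM", {"text_generation", {{"rerank", "rerank"}, {"embeddings", "embed"}}}},
};

// Hypothetical helper: lowercases the model identifier before keyword matching.
std::string normalizeModelIdentifier(std::string identifier) {
    std::transform(identifier.begin(), identifier.end(), identifier.begin(),
        [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return identifier;
}

// Same logic as the reviewed function: returns "" for architectures that are not
// questionable, the matching task when a keyword is found, else the default task.
std::string getTaskForQuestionableArchitecture(const std::string& architecture, const std::string& normalizedModelIdentifier) {
    const auto architectureRules = questionableArchitectureTaskKeywords.find(architecture);
    if (architectureRules == questionableArchitectureTaskKeywords.end()) {
        return "";
    }
    const auto& [defaultTask, patternRules] = architectureRules->second;
    for (const auto& [task, keyword] : patternRules) {
        if (normalizedModelIdentifier.find(keyword) != std::string::npos) {
            return task;
        }
    }
    return defaultTask;
}
```

With this table, a name such as Qwen3-Reranker-0.6B resolves to rerank, Qwen3-Embedding-0.6B to embeddings, and Qwen3-8B falls back to the default task. Setting the default to an empty string for a future architecture would leave the task undetermined unless a keyword matches.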

Comment thread src/cli_parser.cpp Outdated
Comment thread src/cli_parser.cpp
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
3 participants