forked from openpmix/prrte
Sync ompi main with master at bff13fb #102
Merged: jsquyres merged 35 commits into open-mpi:ompi_main from hppritcha:sync_ompi_main_with_master_bff13fb on Nov 14, 2025
Signed-off-by: Matthew Whitlock <mwhitlo@sandia.gov>
Fork Sync: Update from parent repository
This error is also displayed in cases where files or directories do not exist and is not only caused by missing permissions. Signed-off-by: Christoph Niethammer <niethammer@hlrs.de>
The Standard has been updated to allow the log and tool_connected upcalls to return status codes. Update here to support them. Signed-off-by: Ralph Castain <rhc@pmix.org>
Need to do it based on PMIx capabilities Signed-off-by: Ralph Castain <rhc@pmix.org>
Fork Sync: Update from parent repository
There was an initial thought that we would generate some uber-list of defined flags for capabilities across all PMIx versions, and then indicate which ones were supported by this particular version by OR'ing them together into some more general value. This has proven unworkable as you get into a giant game of bit-counting to create the definitions. Instead, we only define flags that this specific version supports - thus, the value of the individual flag is irrelevant and no general value is required. Signed-off-by: Ralph Castain <rhc@pmix.org>
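A minimal sketch of the "define only what this version supports" approach described above. The macro and function names are hypothetical and are not taken from the PMIx or PRRTE headers; the point is only that a consumer tests whether a flag name is defined, never its value.

```c
#include <stdbool.h>

/* Hypothetical capability flag. In this sketch it stands in for a macro
 * that a given library version defines only when it supports the feature.
 * The value is never inspected, only whether the name exists. */
#define EXAMPLE_CAP_UPCALL_STATUS 1

/* True when the (hypothetical) upcall-status capability is defined. */
static bool example_have_upcall_status(void)
{
#ifdef EXAMPLE_CAP_UPCALL_STATUS
    return true;
#else
    return false;
#endif
}

/* EXAMPLE_CAP_GROUP_ORDER is deliberately left undefined here to model a
 * capability that this version does not provide. */
static bool example_have_group_order(void)
{
#ifdef EXAMPLE_CAP_GROUP_ORDER
    return true;
#else
    return false;
#endif
}
```

Because each check depends only on a name being defined, no cross-version table of bit positions has to be maintained.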
Signed-off-by: Ralph Castain <rhc@pmix.org>
Fork Sync: Update from parent repository
Allow the target node list to follow the ordering inside a provided hostfile and dash-host specification by not assigning a bookmark based on the DVM job. Add support for the missing default-hostfile cmd line option: we support specifying it via MCA param, but somehow we lost the integration to pick it up off of the prte and prterun cmd lines. Signed-off-by: Ralph Castain <rhc@pmix.org>
Fork Sync: Update from parent repository
Need to clear the character arrays between calculating binding location for each proc as snprintf doesn't terminate the string. Signed-off-by: Ralph Castain <rhc@pmix.org>
Fork Sync: Update from parent repository
PPR placement policy requests are uniform - i.e., the specified number of procs must be placed on every object of the directed type. When the request includes a cpu/proc directive, then there must also be enough CPUs to meet the request on every object. When that isn't the case, then we need to error out and not just place the proc without binding it. Signed-off-by: Ralph Castain <rhc@pmix.org>
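The uniformity rule in this commit message can be sketched as a feasibility check. This is an illustration under assumed inputs, not the PRRTE mapper code: `ncpus` is a hypothetical array holding the usable CPU count of each object of the directed type, and the function name is invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true only if EVERY object can host `ppr` procs with
 * `cpus_per_proc` CPUs each. A single object that falls short makes the
 * whole request infeasible, so the caller should error out rather than
 * place a proc without binding it. */
static bool example_ppr_feasible(const int *ncpus, size_t nobjs,
                                 int ppr, int cpus_per_proc)
{
    if (ppr <= 0 || cpus_per_proc <= 0) {
        return false;
    }
    for (size_t i = 0; i < nobjs; i++) {
        /* Number of procs this object can hold at the requested width. */
        if (ncpus[i] / cpus_per_proc < ppr) {
            return false;
        }
    }
    return true;
}
```

For instance, a request for 2 procs per object at 2 CPUs each fits objects with 4 CPUs apiece, but fails as soon as one object has only 2 CPUs.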
Fork Sync: Update from parent repository
If we are using the seq or rankfile mapper and have multiple apps on the cmd line, then allow the mappers to compute their own num procs if one or more are not given. Signed-off-by: Ralph Castain <rhc@pmix.org>
The empty nodes were not properly being added to the list of names to be used by the mapper. Signed-off-by: Ralph Castain <rhc@pmix.org>
Per note in the OMPI project, at least one compiler family is removing the "sprintf" function. Replace all uses of that function with the safer "snprintf" version. Signed-off-by: Ralph Castain <rhc@pmix.org>
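As an illustration of the kind of replacement this commit describes, the sketch below formats into a caller-supplied buffer with `snprintf` and checks the return value for truncation. The helper name and the format string are hypothetical and do not come from the PRRTE sources.

```c
#include <assert.h>
#include <stdio.h>

/* Formats "<prefix>-<rank>" into buf, which holds len bytes.
 * Returns 0 on success, -1 if the output was truncated or an
 * encoding error occurred. */
static int example_format_name(char *buf, size_t len,
                               const char *prefix, int rank)
{
    assert(buf != NULL && len > 0);

    /* Old style: sprintf(buf, "%s-%d", prefix, rank); has no bound. */
    int n = snprintf(buf, len, "%s-%d", prefix, rank);
    if (n < 0 || (size_t) n >= len) {
        return -1;
    }
    return 0;
}
```

Checking the return value matters because `snprintf` reports the length it would have written, which is how a caller detects that the buffer was too small.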
Fork Sync: Update from parent repository
When a timeout is specified and the primary job is timed-out, then we need to ensure we also report and kill any child jobs it started. This includes reporting any requested stack traces. Also allow inheritance of output directives like tag and timestamp. Signed-off-by: Ralph Castain <rhc@pmix.org>
Port the "launching-apps" section from the OMPI docs over to PRRTE since it specifically deals with prterun usage. Add some updates about gridengine support courtesy of open-mpi/ompi#13450. Signed-off-by: Ralph Castain <rhc@pmix.org>
Fork Sync: Update from parent repository
Ensure the IDs provided are interpreted as core and not hwt values. Properly error out when an ID is provided that does not exist on the node, and note that this usually is because the IDs address hwt's while we are treating cores as CPUs. Signed-off-by: Ralph Castain <rhc@pmix.org>
Fork Sync: Update from parent repository
…embership The PMIx_Group_construct API itself is not order sensitive in the provided proc array - i.e., we ignore that order when executing the operation. This allows each caller to provide the proc array in arbitrary order - which is advantageous for most libraries. However, some users may need us to return a specific order of the procs in the final membership. Allow them to specify the order in a new attribute. This can be specified by individual processes or by namespace (with a wildcard rank). If multiple participants provide this attribute, then the values must match - i.e., the desired final membership order must be identical. Signed-off-by: Ralph Castain <rhc@pmix.org>
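The "values must match" rule can be sketched as an element-by-element comparison of two requested orderings. The `example_proc_t` type below is a hypothetical stand-in for a process identifier and is not `pmix_proc_t`; the function names are likewise invented, and wildcard-rank handling is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical process identifier used only for this sketch. */
typedef struct {
    char nspace[32];
    int rank;
} example_proc_t;

static bool example_same_proc(const example_proc_t *a,
                              const example_proc_t *b)
{
    return a->rank == b->rank &&
           strncmp(a->nspace, b->nspace, sizeof a->nspace) == 0;
}

/* True when two participants requested an identical final membership
 * order: same length and the same proc at every position. */
static bool example_orders_match(const example_proc_t *x, size_t nx,
                                 const example_proc_t *y, size_t ny)
{
    if (nx != ny) {
        return false;
    }
    for (size_t i = 0; i < nx; i++) {
        if (!example_same_proc(&x[i], &y[i])) {
            return false;
        }
    }
    return true;
}
```

Two requests that list the same procs in a different sequence do not match under this check, which mirrors the requirement that the desired final order be identical.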
Print the cpulist itself, and not its address Signed-off-by: Ralph Castain <rhc@pmix.org>
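A small sketch of the class of mistake this commit fixes, assuming the cpulist is held as a C string. The helper name and output format are hypothetical; the actual PRRTE print statement is not shown in this page.

```c
#include <stdio.h>

/* Writes a description of the cpulist into out (len bytes).
 * Returns what snprintf returns: the length that was or would be written. */
static int example_describe_cpus(char *out, size_t len, const char *cpulist)
{
    /* Printing the pointer, e.g. with "%p" and (void *) cpulist, would
     * show an address. "%s" prints the list contents instead. */
    return snprintf(out, len, "cpus: %s", cpulist);
}
```

Called with the string "0-3", the buffer holds "cpus: 0-3" rather than a pointer value.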
Fork Sync: Update from parent repository
Use the hwloc synthetic topology string as the signature instead of our custom attempt at counting number of types of objects - the synthetic retains some hierarchical info and hopefully does a little better job of detecting when hetero nodes are in use. Signed-off-by: Ralph Castain <rhc@pmix.org>
Update the MCA param help message to clarify what the param does and what values it supports. Clean up an error where we would overwrite the resulting list of signals to forward. Clean up the return value so we don't generate spurious error log output. Provide verbose output showing the signals being forwarded. Signed-off-by: Ralph Castain <rhc@pmix.org>
Further improve automatic handling of hetero nodes by making the non-symmetric signature unique, thereby forcing collection of the full topology from each such node. Fix an error in the topology retrieval procedure whereby we double-counted cached nodes, thereby causing us to quit collecting topologies early. Signed-off-by: Ralph Castain <rhc@pmix.org>
Need to init the ess framework to have the signal forwarding list initialized Signed-off-by: Ralph Castain <rhc@pmix.org>
jsquyres approved these changes on Nov 14, 2025
No description provided.