performans - Code Search

RELEASE.md

        replicas taking part in sync training.
    *   Performance improvements for GPU multi-worker distributed training using
        `tf.distribute.experimental.MultiWorkerMirroredStrategy`
    *   Update NVIDIA `NCCL` to `2.5.7-1` for better performance and performance
        tuning. Please see

Plain Text

- Registered: Tue May 07 12:40:20 GMT 2024

- Last Modified: Mon Apr 29 19:17:57 GMT 2024

- 727.7K bytes

- Viewed (8)

github.com/tensorflow/tensorflow

configure.py

    default_cc_opt_flags = '/arch:AVX'
  else:
    # On all other platforms, no longer use `-march=native` as this can result
    # in instructions that are too modern being generated. Users that want
    # maximum performance should compile TF in their environment and can pass
    # `-march=native` there.
    # See https://github.com/tensorflow/tensorflow/issues/45744 and duplicates
    default_cc_opt_flags = '-Wno-sign-compare'

Python

- Registered: Tue Apr 30 12:39:09 GMT 2024

- Last Modified: Mon Apr 15 18:25:36 GMT 2024

- 53.8K bytes

- Viewed (1)
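
The `configure.py` snippet above picks a default compiler flag per platform and deliberately leaves `-march=native` to users who build in their own environment. A minimal sketch of that selection logic follows. The helper names `pick_default_cc_opt_flags` and `cc_opt_flags` are hypothetical, and the mapping of the `'/arch:AVX'` branch to Windows is an assumption, since the snippet does not show the condition guarding it.

```python
import platform


def pick_default_cc_opt_flags(system=None):
    """Return a default optimization flag string for an OS name.

    Hypothetical helper for illustration; not a function from configure.py.
    """
    if system is None:
        system = platform.system()
    if system == "Windows":
        # Assumption: the '/arch:AVX' branch in the snippet is the Windows
        # (MSVC) case; the guarding condition is not visible in the excerpt.
        return "/arch:AVX"
    # On other platforms the default avoids '-march=native', which can emit
    # instructions too modern for the machine that later runs the binary.
    return "-Wno-sign-compare"


def cc_opt_flags(user_flags=None, system=None):
    """Prefer explicit user flags (e.g. '-march=native') over the default."""
    if user_flags:
        return user_flags
    return pick_default_cc_opt_flags(system)
```

A user who wants maximum performance on their own machine would pass `-march=native` explicitly, as in `cc_opt_flags("-march=native", "Linux")`, while the no-argument default stays portable.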

github.com/tensorflow/tensorflow

tensorflow/c/eager/tape.h

  if (!s.ok()) {
    return s;
  }

  std::unordered_map<int64_t, int64_t> gradients_size;
  // TODO(apassos) multiple threads could be dequeuing from op_stack at the same
  // time, for better CPU backprop performance.
  VLOG(1) << "Initial stack:";
  if (VLOG_IS_ON(1)) {
    for (auto t : op_stack) {
      VLOG(1) << "  " << t;
    }
  }
  while (!op_stack.empty()) {
    const int64_t op = op_stack.back();

C

- Registered: Tue Apr 30 12:39:09 GMT 2024

- Last Modified: Tue Apr 02 12:40:29 GMT 2024

- 47.2K bytes

- Viewed (1)

github.com/tensorflow/tensorflow

.bazelrc

# Linux x86 so that we can use RBE. Since tests still need to run on the single
# host Arm64 machine, the build becomes too slow (~30 min) to be a presubmit.
# For testing purposes, we want to see the runtime performance of an
# experimental job that is build-only, i.e., we only build the test targets and
# do not run them. By prefixing the configs with "build", we can run both

Plain Text

- Registered: Tue May 07 12:40:20 GMT 2024

- Last Modified: Thu May 02 19:34:20 GMT 2024

- 52.8K bytes

- Viewed (2)

github.com/tensorflow/tensorflow

tensorflow/c/c_api_function.cc

                                    fn_name, "'");
    output_tensors->emplace_back(node, idx);
  }
  return absl::OkStatus();
}

// Populates `body_nodes` with the nodes that will become function's body.
// Performs various checks.
Status ComputeBodyNodes(
    const TF_Graph* fn_body, const char* fn_name, int num_opers,
    const TF_Operation* const* opers,
    const std::unordered_map<const Node*, std::vector<int>>& input_nodes,

C++

- Registered: Tue Apr 30 12:39:09 GMT 2024

- Last Modified: Mon Apr 15 03:35:10 GMT 2024

- 13.6K bytes

- Viewed (2)
