PyTorch C++ API

1. Overview of the PyTorch C++ API

The PyTorch C++ API provides a toolkit for tensor computation and deep learning model development in C++. It consists of the following main parts:

1.1 The ATen Library

ATen is PyTorch's foundational tensor library. It provides the tensor type and the mathematical operations defined on it.

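#include <ATen/ATen.h>

// A 2x2 integer tensor of ones and a 2x2 float tensor of standard-normal samples
at::Tensor a = at::ones({2, 2}, at::kInt);
at::Tensor b = at::randn({2, 2});
// Cast b to int (truncating toward zero) so both operands share a dtype
auto c = a + b.to(at::kInt);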

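1.2 Autograd (Automatic Differentiation)

Autograd is the automatic differentiation component of the PyTorch C++ API. It extends ATen tensors so that operations on them are recorded and can be differentiated.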

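#include <torch/torch.h>

// a tracks gradients; b does not
torch::Tensor a = torch::ones({2, 2}, torch::requires_grad());
torch::Tensor b = torch::randn({2, 2});
auto c = a + b;
// backward() needs a scalar output, so reduce c before differentiating
c.sum().backward();
// a.grad() now holds a 2x2 tensor of ones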
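
To make the idea of recording operations and differentiating them concrete, here is a minimal sketch of a reverse-mode tape for scalars, written in plain C++ without LibTorch. It is not how Autograd is implemented; the names Node, Tape and grad_xy_plus_x are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One recorded operation on the tape.
struct Node {
    int lhs;      // tape index of the first input, -1 for a leaf
    int rhs;      // tape index of the second input, -1 for a leaf
    double dlhs;  // local derivative d(out)/d(lhs)
    double drhs;  // local derivative d(out)/d(rhs)
};

// A toy scalar tape: values are appended in evaluation order.
struct Tape {
    std::vector<double> value;
    std::vector<Node> node;

    int push(double v, Node n) {
        value.push_back(v);
        node.push_back(n);
        return static_cast<int>(value.size()) - 1;
    }
    int leaf(double v) { return push(v, {-1, -1, 0.0, 0.0}); }
    int add(int a, int b) { return push(value[a] + value[b], {a, b, 1.0, 1.0}); }
    int mul(int a, int b) { return push(value[a] * value[b], {a, b, value[b], value[a]}); }

    // Reverse sweep: walk the tape backwards and accumulate chain-rule products.
    std::vector<double> backward(int out) const {
        assert(out >= 0 && static_cast<std::size_t>(out) < value.size());
        std::vector<double> grad(value.size(), 0.0);
        grad[out] = 1.0;
        for (int i = out; i >= 0; --i) {
            if (node[i].lhs >= 0) grad[node[i].lhs] += grad[i] * node[i].dlhs;
            if (node[i].rhs >= 0) grad[node[i].rhs] += grad[i] * node[i].drhs;
        }
        return grad;
    }
};

// Gradient of z = x * y + x with respect to (x, y).
std::vector<double> grad_xy_plus_x(double x, double y) {
    Tape t;
    int ix = t.leaf(x);
    int iy = t.leaf(y);
    int iz = t.add(t.mul(ix, iy), ix);
    std::vector<double> g = t.backward(iz);
    return {g[ix], g[iy]};
}
```

Calling grad_xy_plus_x(3.0, 4.0) yields dz/dx = y + 1 = 5 and dz/dy = x = 3. A tensor library applies the same reverse sweep, with tensors in place of scalars.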

1.3 The C++ Frontend

The C++ frontend provides a high-level interface for building and training neural network models.

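#include <torch/torch.h>

class SimpleModel : public torch::nn::Module {
public:
    SimpleModel() {
        // register_module makes the layer's parameters visible to parameters()
        linear = register_module("linear", torch::nn::Linear(10, 2));
    }

    torch::Tensor forward(torch::Tensor x) {
        return linear(x);
    }

private:
    // Module holders cannot be default-constructed, so start from nullptr
    torch::nn::Linear linear{nullptr};
};

int main() {
    SimpleModel model;
    torch::Tensor input = torch::randn({1, 10});
    // torch::nn::Module defines no operator(), so call forward() directly
    torch::Tensor output = model.forward(input);
    return 0;
}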

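1.4 TorchScript Support

TorchScript is a serializable model representation executed by PyTorch's JIT compiler and interpreter. It supports saving a model to disk and optimizing it.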

#include <torch/script.h>

int main() {
    // Load a TorchScript model that was exported from Python
    torch::jit::Module model = torch::jit::load("model.pt");

    // Run inference; forward() returns an IValue, so convert it to a tensor
    torch::Tensor input = torch::randn({1, 3, 224, 224});
    torch::Tensor output = model.forward({input}).toTensor();

    return 0;
}

1.5 C++ Extensions

C++ extensions let developers extend PyTorch with custom C++ and CUDA code.

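#include <torch/extension.h>

// A custom operation: element-wise addition of two tensors
torch::Tensor custom_add(torch::Tensor x, torch::Tensor y) {
    return x + y;
}

// The module name must match the name the extension is built under
// (TORCH_EXTENSION_NAME can be used instead when building with torch.utils.cpp_extension)
PYBIND11_MODULE(custom_ops, m) {
    m.def("custom_add", &custom_add, "A custom add operation");
}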

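2. Setting Up the Development Environment

2.1 Installing the PyTorch C++ API

Prebuilt LibTorch packages can be downloaded from the official PyTorch website, or the library can be built from source.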

# Install with conda (the pytorch package ships the C++ headers and libraries)
conda install pytorch torchvision torchaudio -c pytorch

2.2 Configuring the Toolchain

A compiler with C++17 support is recommended, such as GCC 9 or later. CMake can be used to manage the project build.

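cmake_minimum_required(VERSION 3.18)
project(MyPyTorchProject)

# Point CMAKE_PREFIX_PATH at the LibTorch install so find_package can locate it
find_package(Torch REQUIRED)

add_executable(my_app main.cpp)
target_link_libraries(my_app PRIVATE "${TORCH_LIBRARIES}")
set_property(TARGET my_app PROPERTY CXX_STANDARD 17)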

3. Model Development and Training

3.1 Defining a Model

Define a neural network model by deriving from torch::nn::Module.

#include <torch/torch.h>

class MyModel : public torch::nn::Module {
public:
    MyModel() {
        // 1 input channel, 20 output channels, 5x5 kernel
        conv1 = register_module("conv1", torch::nn::Conv2d(
            torch::nn::Conv2dOptions(1, 20, 5)
        ));
        // 20 * 20 * 20 assumes 44x44 inputs: (44 - 5 + 1) / 2 = 20 per side
        fc1 = register_module("fc1", torch::nn::Linear(20 * 20 * 20, 500));
        fc2 = register_module("fc2", torch::nn::Linear(500, 10));
    }

    torch::Tensor forward(torch::Tensor x) {
        x = torch::relu(conv1(x));
        x = torch::max_pool2d(x, 2);
        x = x.view({-1, 20 * 20 * 20});
        x = torch::relu(fc1(x));
        // Return log-probabilities, the input that nll_loss expects
        return torch::log_softmax(fc2(x), /*dim=*/1);
    }

private:
    // Module holders cannot be default-constructed, so start from nullptr
    torch::nn::Conv2d conv1{nullptr};
    torch::nn::Linear fc1{nullptr}, fc2{nullptr};
};
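
The input size of the first Linear layer has to match the flattened output of the convolution and pooling stages, which is easy to get wrong. The sketch below works through that arithmetic in plain C++. The helper names window_output_size and flattened_features are hypothetical, and the 44x44 and 28x28 input sizes are assumptions for illustration, since the text does not state the input size.

```cpp
#include <cassert>

// Output side length of a convolution or pooling window over a square input,
// using floor((in + 2 * padding - kernel) / stride) + 1 with dilation 1.
long window_output_size(long in, long kernel, long stride, long padding) {
    assert(in > 0 && kernel > 0 && stride > 0 && padding >= 0);
    assert(in + 2 * padding >= kernel);
    return (in + 2 * padding - kernel) / stride + 1;
}

// Flattened feature count after one 5x5 convolution (stride 1, no padding)
// followed by 2x2 max pooling (stride 2), with `channels` output channels.
long flattened_features(long input_side, long channels) {
    long after_conv = window_output_size(input_side, 5, 1, 0);
    long after_pool = window_output_size(after_conv, 2, 2, 0);
    return channels * after_pool * after_pool;
}
```

With 20 output channels, a 44x44 input gives 20 * 20 * 20 = 8000 features, while a 28x28 input gives 20 * 12 * 12 = 2880. The Linear layer and the view call must both use whichever value matches the real input.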

3.2 Loading and Processing Data

Use torch::data to load and process data.

#include <torch/torch.h>

class MyDataset : public torch::data::datasets::Dataset<MyDataset> {
public:
    explicit MyDataset(std::string path) : path_(std::move(path)) {}

    torch::data::Example<> get(size_t index) override {
        // Implement the loading logic here; these tensors are placeholders only
        torch::Tensor data = torch::zeros({1, 44, 44});
        torch::Tensor target = torch::tensor(0, torch::kLong);
        return {data, target};
    }

    torch::optional<size_t> size() const override {
        return 1000; // number of examples in the dataset
    }

private:
    std::string path_;
};

int main() {
    // Stack collates a vector of examples into one batched Example
    auto dataset = MyDataset("data").map(torch::data::transforms::Stack<>());
    auto dataloader = torch::data::make_data_loader(
        std::move(dataset),
        torch::data::DataLoaderOptions().batch_size(32).workers(4)
    );

    for (auto& batch : *dataloader) {
        auto data = batch.data;
        auto target = batch.target;
        // training logic
    }

    return 0;
}
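
For a dataset of 1000 examples and a batch size of 32, the last batch is smaller than the rest. The following plain C++ sketch shows the arithmetic. The function names are hypothetical, and the drop_last flag is assumed to behave like the data loader option of the same name: it discards a final partial batch.

```cpp
#include <cassert>
#include <cstddef>

// Number of batches produced per epoch.
std::size_t batch_count(std::size_t dataset_size, std::size_t batch_size, bool drop_last) {
    assert(batch_size > 0);
    std::size_t full = dataset_size / batch_size;
    bool has_partial = dataset_size % batch_size != 0;
    return (has_partial && !drop_last) ? full + 1 : full;
}

// Size of the final batch when partial batches are kept.
std::size_t last_batch_size(std::size_t dataset_size, std::size_t batch_size) {
    assert(batch_size > 0);
    std::size_t rem = dataset_size % batch_size;
    if (rem != 0) return rem;
    return dataset_size == 0 ? 0 : batch_size;
}
```

Here batch_count(1000, 32, false) is 32 and the final batch holds 8 examples, so code that assumes a fixed batch dimension of 32 would fail on the last iteration.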

3.3 Training and Optimization

Use an optimizer to train the model.

#include <torch/torch.h>

int main() {
    MyModel model;
    torch::optim::SGD optimizer(
        model.parameters(),
        torch::optim::SGDOptions(0.01).momentum(0.9)
    );

    // dataloader is the data loader built in section 3.2
    for (auto& batch : *dataloader) {
        auto data = batch.data;
        auto target = batch.target;

        optimizer.zero_grad();
        auto output = model.forward(data);
        // nll_loss expects log-probabilities, e.g. the output of log_softmax
        auto loss = torch::nn::functional::nll_loss(output, target);
        loss.backward();
        optimizer.step();
    }

    return 0;
}
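
What optimizer.step() does for SGD with momentum can be sketched in plain C++. This is an illustration under stated assumptions rather than the library's code: dampening and weight decay are assumed to be zero, the velocity starts at zero, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One SGD-with-momentum update:
//   velocity = momentum * velocity + grad
//   param    = param - lr * velocity
void sgd_momentum_step(std::vector<double>& params, const std::vector<double>& grads,
                       std::vector<double>& velocity, double lr, double momentum) {
    assert(params.size() == grads.size() && params.size() == velocity.size());
    for (std::size_t i = 0; i < params.size(); ++i) {
        velocity[i] = momentum * velocity[i] + grads[i];
        params[i] -= lr * velocity[i];
    }
}

// Minimizes f(w) = (w - 3)^2 starting from w = 0 and returns the final w.
double minimize_quadratic(int steps, double lr, double momentum) {
    std::vector<double> w{0.0}, v{0.0}, g{0.0};
    for (int s = 0; s < steps; ++s) {
        g[0] = 2.0 * (w[0] - 3.0);  // gradient of (w - 3)^2
        sgd_momentum_step(w, g, v, lr, momentum);
    }
    return w[0];
}
```

With the learning rate 0.01 and momentum 0.9 used above, minimize_quadratic(400, 0.01, 0.9) ends up very close to the minimizer w = 3.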

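4. Inference and Deployment

4.1 Saving and Loading Models

torch::save and torch::load serialize a module's parameters and buffers. To ship a whole model including its structure, use TorchScript (section 4.2).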

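#include <torch/torch.h>

int main() {
    // torch::save and torch::load take modules by shared_ptr (or module holder)
    auto model = std::make_shared<MyModel>();

    // Save the model's parameters and buffers
    torch::save(model, "model.pth");

    // Load them into a freshly constructed model with the same architecture
    auto loaded_model = std::make_shared<MyModel>();
    torch::load(loaded_model, "model.pth");

    return 0;
}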

4.2 Inference with a TorchScript Model

Load and run a TorchScript model.

#include <torch/script.h>

#include <vector>

int main() {
    // Load the TorchScript model and switch it to evaluation mode
    torch::jit::Module model = torch::jit::load("model.pt");
    model.eval();

    // Run inference; forward() returns an IValue, so convert it to a tensor
    torch::Tensor input = torch::randn({1, 3, 224, 224});
    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(input);
    torch::Tensor output = model.forward(inputs).toTensor();

    return 0;
}

5. Performance Optimization Tips

5.1 Mixed-Precision Training

Use mixed precision during training to speed up computation.

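#include <torch/torch.h>

int main() {
    MyModel model;
    model.to(torch::kCUDA);
    torch::optim::SGD optimizer(model.parameters(), torch::optim::SGDOptions(0.01));

    // GradScaler is a Python-side utility (torch.cuda.amp.GradScaler). This sketch
    // does not assume a C++ counterpart and applies a fixed loss scale by hand.
    // Running the forward pass in reduced precision is left out: autocast is
    // controlled through at::autocast (<ATen/autocast_mode.h>), whose function
    // names differ between LibTorch versions.
    const double loss_scale = 1024.0; // assumed value; tune for your model

    // dataloader is the data loader built in section 3.2
    for (auto& batch : *dataloader) {
        auto data = batch.data.to(torch::kCUDA);
        auto target = batch.target.to(torch::kCUDA);

        optimizer.zero_grad();
        auto loss = torch::nn::functional::nll_loss(model.forward(data), target);

        // Scale the loss so that small gradients do not underflow
        (loss * loss_scale).backward();

        // Unscale the gradients before the optimizer consumes them
        for (auto& p : model.parameters()) {
            if (p.grad().defined()) p.grad().div_(loss_scale);
        }
        optimizer.step();
    }

    return 0;
}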
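
The scale, step, update cycle of a gradient scaler can be sketched without any GPU code. The plain C++ below models dynamic loss scaling: back off when gradients overflow, grow after a run of finite steps. The LossScaler type and simulate_scale helper are hypothetical. The default values are assumptions modelled on the Python torch.cuda.amp.GradScaler defaults (initial scale 65536, growth factor 2, backoff factor 0.5, growth interval 2000).

```cpp
#include <cassert>
#include <vector>

// Dynamic loss scale: shrink on overflow, grow after enough clean steps.
struct LossScaler {
    double scale = 65536.0;
    double growth_factor = 2.0;
    double backoff_factor = 0.5;
    int growth_interval = 2000;
    int good_steps = 0;

    // Returns true when the optimizer step should be applied.
    bool update(bool grads_finite) {
        if (!grads_finite) {
            scale *= backoff_factor;  // overflow: skip the step and back off
            good_steps = 0;
            return false;
        }
        if (++good_steps == growth_interval) {
            scale *= growth_factor;   // stable for a while: try a larger scale
            good_steps = 0;
        }
        return true;
    }
};

// Replays a sequence of "were the gradients finite?" flags and returns the scale.
double simulate_scale(const std::vector<bool>& finite_flags, int growth_interval) {
    assert(growth_interval > 0);
    LossScaler scaler;
    scaler.growth_interval = growth_interval;
    for (bool finite : finite_flags) scaler.update(finite);
    return scaler.scale;
}
```

In a training loop, the return value of update() decides whether optimizer.step() runs for that iteration, so a step with overflowed gradients is skipped instead of corrupting the weights.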

5.2 Multi-GPU Parallel Training

Train a model in parallel across multiple GPUs.

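#include <torch/torch.h>
#include <torch/nn/parallel/data_parallel.h>

int main() {
    // Single-process data parallelism over all visible GPUs. Multi-process
    // training with NCCL process groups lives in the c10d library and is not
    // part of the torch:: frontend calls shown here.
    if (torch::cuda::device_count() < 2) {
        return 0; // nothing to parallelize
    }

    // data_parallel replicates the module with clone(), so the module must
    // derive from torch::nn::Cloneable; built-in layers such as Linear do.
    torch::nn::Linear model(10, 2);
    model->to(torch::kCUDA);

    // The batch is split along dim 0, each replica runs forward on its chunk,
    // and the outputs are gathered on the first device.
    auto input = torch::randn({64, 10}, torch::kCUDA);
    auto output = torch::nn::parallel::data_parallel(model, input);

    return 0;
}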
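
Data parallelism splits each batch along its first dimension before handing the pieces to the devices. The sketch below shows how the per-device sizes come out, assuming the split follows the rule used by Tensor::chunk: every chunk has ceil(batch / devices) elements except possibly the last. The function name chunk_sizes is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Sizes of the pieces a batch is split into across `devices` devices.
// Fewer than `devices` pieces may be produced when the batch is small.
std::vector<long> chunk_sizes(long batch, long devices) {
    assert(batch >= 0 && devices > 0);
    long per = (batch + devices - 1) / devices;  // ceil(batch / devices)
    std::vector<long> sizes;
    for (long left = batch; left > 0; left -= per) {
        sizes.push_back(left < per ? left : per);
    }
    return sizes;
}
```

A batch of 64 over 4 devices gives 16 examples each, while a batch of 10 gives 3, 3, 3 and 1. Choosing a batch size divisible by the device count keeps the load even.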

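6. Summary and Outlook

The PyTorch C++ API lets developers build and deploy deep learning models efficiently in a C++ environment. Combining ATen, Autograd, the C++ frontend, TorchScript, and C++ extensions makes it possible to build high-performance machine learning applications.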
