<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="robots" content="index, follow" />
<meta name="keywords" content="OpenCL portable OpenCL PoCL pocl Portable Computing Language" />
<meta name="description" content="PoCL - Portable Computing Language" />
<meta property="og:title" content="PoCL home page"/>
<meta property="og:site_name" content="PoCL"/>
<meta property="og:type" content="website"/>
<meta property="og:description" content="PoCL: a performance portable open source OpenCL implementation"/>
<meta property="og:url" content="http://portablecl.org"/>
<title>PoCL - Portable Computing Language | No-MPI OpenCL-Only Distributed Computing With PoCL-Remote</title>
<link rel="stylesheet" type="text/css" href="pocl-style.css" />
</head>
<body>
<div id="page">
<div id="header">
<h1 id="title"><span style="height: 100%; vertical-align: middle;"></span>
<a href="http://portablecl.org"><img src="img/pocl-80x60.png" border="0" style="vertical-align: middle;"></a>
<span style="height: 100%; vertical-align: middle;"> Portable Computing Language | No-MPI OpenCL-Only Distributed Computing With PoCL-Remote</span></h1>
</div>
<div id="navi">
<ul id="menu_item_list">
<li class="menu_item"><a href="index.html" class="menu_link">Main</a></li>
<li class="menu_item"><a href="download.html" class="menu_link">Download</a></li>
<li class="menu_item"><a href="docs/html" class="menu_link">Documentation</a></li>
<li class="menu_item"><a href="development.html" class="menu_link">Development</a></li>
<li class="menu_item"><a href="discussion.html" class="menu_link">Discussion</a></li>
<li class="menu_item"><a href="https://github.com/pocl/pocl/wiki" class="menu_link">Wiki</a></li>
<li class="menu_item"><a href="publications.html" class="menu_link">Publications</a></li>
</ul>
</div>
<div id="content">
<h1>September 4th, 2023: No-MPI OpenCL-Only Distributed Computing With PoCL-Remote</h1>
<p>PoCL now has a new backend that transparently offloads OpenCL tasks to
other nodes on the network, making it possible to distribute computation
without using MPI or similar APIs. Since the standard OpenCL API suffices,
compute offloading can be performed identically whether using local or
remote devices, which makes it useful for selective/adaptive edge
offloading and other use cases.</p>
<p>The driver is now considered ready for out-of-lab testing and has been
integrated into the <a href="http://code.portablecl.org">main</a> branch for the
upcoming v5.0 release. The work was primarily carried out by Michal Babej,
Jan Solanti and Pekka Jääskeläinen in the <a href="http://tuni.fi/cpc">Customized Parallel Computing</a>
group at <a href="http://tuni.fi/en">Tampere University</a> within multiple
European and national research projects.</p>
<p>PoCL-Remote follows a client-server architecture and consists of two
parts: the <code>remote</code> driver in the PoCL library and the
<code>pocld</code> daemon.</p>
<p>The daemon can be run on any machine that is reachable via TCP/IP
and has an OpenCL implementation available. Any OpenCL implementation/driver/device
works on the server-side, not only PoCL-based drivers. The client library connects to the
configured daemon(s) and lists the OpenCL devices available on them in
<code>clGetDeviceIDs</code> just as if they were local to the client.</p>
<div style="text-align: center;">
<img src="img/pocl-remote.svg" border="0" style="width: 60%;" />
</div>
<p>
In contrast to existing networked offloading solutions for OpenCL, PoCL-Remote
makes use of PoCL's memory management infrastructure to keep track of memory
objects and only copies them when actually necessary. When a memory
object migration is needed, the most efficient route for the transfer is
automatically chosen:
</p>
<ul>
<li>If both the source and destination device are part of the same native
OpenCL context on the machine running the daemon, the migration is delegated
to the underlying native driver.</li>
<li>If the devices are part of separate OpenCL platforms, the memory contents
are manually copied through host memory within the daemon.</li>
<li>For copies between two daemons, direct peer-to-peer connections are used
in order to minimize the amount of traffic to and from the client.</li>
<li>Should all else fail, the memory contents are downloaded into the client's
memory and uploaded to the destination from there.</li>
</ul>
<p>
Similarly, synchronisation between commands is done with OpenCL events within
a daemon, and with OpenCL user events that are signaled in peer-to-peer fashion
for dependencies between daemons. In addition to automatically choosing the
most efficient route for communication, PoCL-Remote is also able to use size
buffers, as specified in the <code>cl_pocl_content_size</code> extension, to
transfer only the meaningful portion of buffers with variable-sized content
such as compressed image or video data.
</p>
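<p>The route selection described above can be pictured as a small decision
function. The following Python sketch is only an illustrative model, not
PoCL-Remote's actual implementation: the names <code>Device</code>,
<code>Route</code>, <code>choose_route</code> and <code>p2p_links</code> are
hypothetical, and it assumes that any two devices on the same daemon that do
not share a native context fall back to a copy through the daemon's host
memory.</p>

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    NATIVE_DRIVER = "delegate to the native driver"
    DAEMON_HOST_COPY = "copy through host memory within the daemon"
    PEER_TO_PEER = "direct daemon-to-daemon transfer"
    VIA_CLIENT = "download to the client, then upload"


@dataclass(frozen=True)
class Device:
    daemon: str   # hypothetical identifier of the pocld instance
    context: str  # hypothetical identifier of the native OpenCL context


def choose_route(src, dst, p2p_links):
    """Pick a migration route, preferring the cheapest option first.

    p2p_links is a set of frozenset({daemon_a, daemon_b}) pairs that
    have a working direct connection (an assumption of this sketch).
    """
    if src.daemon == dst.daemon:
        if src.context == dst.context:
            return Route.NATIVE_DRIVER
        # Assumption: everything else on one daemon goes via its host memory.
        return Route.DAEMON_HOST_COPY
    if frozenset((src.daemon, dst.daemon)) in p2p_links:
        return Route.PEER_TO_PEER
    # Last resort: route the data through the client.
    return Route.VIA_CLIENT
```

<p>In this model the client-mediated path is only taken when the two daemons
have no direct link, which mirrors the ordering of the list above.</p>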
<p>Early versions of PoCL-R or experiments using it have been presented at
<a href="https://doi.org/10.1145/3388333.3388642">IWOCL '20</a>,
<a href="https://doi.org/10.1007/978-3-031-04580-6_6">SAMOS 2021</a> and
<a href="http://doi.org/10.1145/3585341.3585376">IWOCL '23</a>.
There is also a full-length journal article under review that describes the published
version (including, for example, its RDMA support). A preprint of it is available on
<a href="https://doi.org/10.48550/arXiv.2309.00407">arXiv</a>.
</p>
<p>
More information, including instructions for building and using the PoCL-Remote backend, can be found in the
<a href="http://portablecl.org/docs/html/remote.html">user manual</a>.
</p>
<div style="text-align: center;">
<a href="img/pocl-remote-screenshot-august2023.png">
<img src="img/pocl-remote-screenshot-august2023.png" border="0" width="80%" style="vertical-align: middle; " />
</a>
</div>
<h2>Status and Maturity</h2>
<p>
The backend passes most of PoCL's basic test suite and has been successfully
used to run complex applications such as
<a href="https://github.com/ProjectPhysX/FluidX3D">FluidX3D</a> and various
computer vision and machine learning demos. The actual usable set of features
is naturally also dependent on the native driver controlled by the daemon.
</p>
<p>PoCL-R has mainly been tested within the research group, but it has also been
integrated into proper demonstrators, so it can be considered TRL 4 to TRL 5 on the EU scale.
</p>
<p>
In terms of performance, moving data across the network naturally carries a
major penalty. However, the penalty is mostly limited to buffer transfers
(Write-/Copy-/ReadBuffer and migrating a buffer from one device to another).
Such transfers can easily become a bottleneck with other drivers as well, so
designing applications with that in mind is advisable in general.</p>
<p>
It is worth noting that PoCL-R will leave buffers resident on devices after
use, so unchanged buffers do not need to be transferred again on next use.
This means that static buffers such as neural network coefficients only need
to be uploaded once during launch and afterwards inference can be performed
repeatedly without this initial buffer transfer cost.
</p>
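<p>The effect of buffer residency on transfer volume can be illustrated with a
toy model. The Python sketch below is not PoCL-R code: the
<code>ResidencyTracker</code> class, its methods and the device names are
hypothetical, and it simply assumes that a buffer is sent over the network
only when the target device does not already hold an up-to-date copy.</p>

```python
class ResidencyTracker:
    """Toy model of which devices hold an up-to-date copy of each buffer."""

    def __init__(self):
        self._valid = {}     # buffer name -> set of devices with a valid copy
        self.bytes_sent = 0  # total bytes moved over the (simulated) network

    def write(self, buf, device):
        """A command on `device` modified `buf`; all other copies become stale."""
        self._valid[buf] = {device}

    def use(self, buf, size, device):
        """Use `buf` on `device`; return True if a transfer was needed."""
        holders = self._valid.setdefault(buf, set())
        if device in holders:
            return False  # already resident, nothing to send
        self.bytes_sent += size
        holders.add(device)
        return True


tracker = ResidencyTracker()
# Hypothetical workload: 10 inference runs reusing the same 1000-byte weights.
transfers = [tracker.use("weights", 1000, "remote_gpu0") for _ in range(10)]
```

<p>In this model only the first of the ten uses triggers a transfer, which
corresponds to uploading static data once at launch and reusing it
afterwards.</p>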
<p>
In multi-server setups, the cost of server-to-server transfers can be
mitigated somewhat by building PoCL with RDMA support enabled, provided that
the networking hardware supports RDMA.
</p>
<h4>Known Limitations (as of 2023-10-03)</h4>
<ul>
<li>There is no traffic encryption or user authentication on the daemon side, making PoCL-R currently
not suitable for use outside of closed private networks/clusters.</li>
<li>PoCL must be built with <code>LOADABLE_DRIVERS=OFF</code>, otherwise initialisation of the remote backend fails.</li>
<li>While <code>printf</code> does somewhat work, it will likely behave differently from what applications expect.</li>
</ul>
<h2>Contributing</h2>
<p>We welcome contributions in the form of good-quality bug reports and pull requests.
However, because only a limited number of developers work on the project, we cannot
commit to rapid support for issues that do not affect us. If you're interested in improving PoCL-R
but aren't sure what to work on, please ask on the
<a href="https://lists.sourceforge.net/lists/listinfo/pocl-devel">mailing list</a> or the
<a href="https://github.com/pocl/pocl/discussions">discussion forum</a> for more information.
</p>
</div>
<div id="footer">
<span style="height: 100%; vertical-align: middle;"></span>
<a href="http://portablecl.org"><img src="img/pocl-80x60.png" border="0" style="vertical-align: middle;"></a>
<span style="height: 100%; vertical-align: middle;">Portable Computing Language © 2010-2023 PoCL developers
</span>
</div>
<div class="g-plusone" data-annotation="inline" data-width="300"></div>
</div>
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-36911879-1']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
(function() {
var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true;
po.src = 'https://apis.google.com/js/plusone.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s);
})();
</script>
</body>
</html>