Step-5: PFASST
In this step, we will show how pySDC can do PFASST runs (virtually parallel for now).
Part A: Multistep multilevel hierarchy
In this first part, we create a controller and demonstrate how pySDC’s data structures represent multiple time-steps.
While for SDC the step data structure is the key part, we now simply have a list of steps, bundled in the MS attribute of the controller.
The nice thing about going from MLSDC to PFASST is that only the number of processes in the num_procs variable has to be changed.
This way the controller knows that multiple steps have to be computed in parallel.
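As an illustration, here is a minimal sketch of how this hierarchy can be inspected, assuming a controller has been created as in the full example below (only the attributes controller.MS, step.levels and level.prob from that example are used):

# sketch: each entry of controller.MS is one step (one "process" in time),
# and each step carries its own list of levels
for p, step in enumerate(controller.MS):
    for l, level in enumerate(step.levels):
        print("Process %2i, level %i: problem class %s" % (p, l, type(level.prob).__name__))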
Important things to note:
- To avoid the tedious installation of mpi4py and to have full access to all data at all times, the controllers with the _nonMPI suffix only emulate parallelism. The algorithm is the same, but the steps are performed serially. Using MPI controllers allows for real parallelism and should yield the same results (see the next tutorial step).
- While in principle all steps can have a different number of levels, the controllers implemented so far assume that the number of levels is constant. Also, the instantiation of (the list of) steps via the controllers is implemented only for this case. Yet, pySDC's data structures in principle allow for different approaches.
Full code: pySDC/tutorial/step_5/A_multistep_multilevel_hierarchy.py
from pathlib import Path
from pySDC.implementations.controller_classes.controller_nonMPI import controller_nonMPI
from pySDC.implementations.problem_classes.HeatEquation_ND_FD import heatNd_forced
from pySDC.implementations.sweeper_classes.imex_1st_order import imex_1st_order
from pySDC.implementations.transfer_classes.TransferMesh import mesh_to_mesh
def main():
"""
A simple test program to setup a full multi-step multi-level hierarchy
"""
# initialize level parameters
level_params = {}
level_params['restol'] = 1e-10
level_params['dt'] = 0.5
# initialize sweeper parameters
sweeper_params = {}
sweeper_params['quad_type'] = 'RADAU-RIGHT'
sweeper_params['num_nodes'] = [3]
# initialize problem parameters
problem_params = {}
problem_params['nu'] = 0.1 # diffusion coefficient
problem_params['freq'] = 4 # frequency for the test value
problem_params['nvars'] = [31, 15, 7] # number of degrees of freedom for each level
problem_params['bc'] = 'dirichlet-zero' # boundary conditions
# initialize step parameters
step_params = {}
step_params['maxiter'] = 20
# initialize space transfer parameters
space_transfer_params = {}
space_transfer_params['rorder'] = 2
space_transfer_params['iorder'] = 6
# fill description dictionary for easy step instantiation
description = {}
description['problem_class'] = heatNd_forced # pass problem class
description['problem_params'] = problem_params # pass problem parameters
description['sweeper_class'] = imex_1st_order # pass sweeper (see part B)
description['sweeper_params'] = sweeper_params # pass sweeper parameters
description['level_params'] = level_params # pass level parameters
description['step_params'] = step_params # pass step parameters
description['space_transfer_class'] = mesh_to_mesh # pass spatial transfer class
description['space_transfer_params'] = space_transfer_params # pass parameters for spatial transfer
# instantiate controller
controller = controller_nonMPI(num_procs=10, controller_params={}, description=description)
# check number of levels
Path("data").mkdir(parents=True, exist_ok=True)
f = open('data/step_5_A_out.txt', 'w')
for i in range(len(controller.MS)):
out = "Process %2i has %2i levels" % (i, len(controller.MS[i].levels))
f.write(out + '\n')
print(out)
f.close()
assert all(len(S.levels) == 3 for S in controller.MS), "ERROR: not all steps have the same number of levels"
if __name__ == "__main__":
main()
Results:
Process 0 has 3 levels
Process 1 has 3 levels
Process 2 has 3 levels
Process 3 has 3 levels
Process 4 has 3 levels
Process 5 has 3 levels
Process 6 has 3 levels
Process 7 has 3 levels
Process 8 has 3 levels
Process 9 has 3 levels
Part B: My first PFASST run
Now that we have created our multi-step, multi-level hierarchy, we are ready to run PFASST.
We choose the simple heat equation for our first test.
One of the most important characteristics of a parallel-in-time algorithm is its behavior for an increasing number of parallel time-steps for a given problem (i.e. with dt and Tend fixed).
Therefore, we loop over the number of parallel time-steps in this example to see how PFASST performs for 1, 2, …, 16 parallel steps.
We compute and check the error as well as several statistical quantities, e.g. the mean number of iterations, the range of iteration counts and so on.
We see that PFASST performs very well in this case: the iteration counts do not increase significantly.
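To illustrate how these quantities are obtained, the following sketch extracts the iteration counts from the statistics object returned by the controller; it uses the same get_sorted helper and 'niter' entry that appear in the full code below, and assumes stats comes from a previous controller.run call:

import numpy as np
from pySDC.helpers.stats_helper import get_sorted

# sketch: per-step iteration counts, sorted by time
iter_counts = get_sorted(stats, type='niter', sortby='time')  # list of (time, niter) tuples
niters = np.array([item[1] for item in iter_counts])
print('Mean: %4.2f, spread: %2i' % (np.mean(niters), np.ptp(niters)))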
Important things to note:
- In the IMEX sweeper, we can activate the LU-trick for the implicit part by specifying QI as LU (see the short sketch after this list). For stiff parabolic problems with Gauss-Radau nodes, this is usually a very good idea!
- As usual for MLSDC and PFASST, the success depends heavily on the choice of parameters. Making the problem more complicated, less or more stiff, changing the order of the spatial interpolation etc. can give completely different results.
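As a minimal sketch of the first point, the preconditioner is selected via a single entry in the sweeper parameters (the same keys as in the full code below):

# sketch: activate the LU-trick for the implicit part of the IMEX sweeper
sweeper_params = dict()
sweeper_params['quad_type'] = 'RADAU-RIGHT'
sweeper_params['num_nodes'] = [3]
sweeper_params['QI'] = 'LU'  # 'IE' would select the implicit Euler preconditioner instead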
Full code: pySDC/tutorial/step_5/B_my_first_PFASST_run.py
from pathlib import Path
import numpy as np
from pySDC.helpers.stats_helper import get_sorted
from pySDC.implementations.controller_classes.controller_nonMPI import controller_nonMPI
from pySDC.implementations.problem_classes.HeatEquation_ND_FD import heatNd_forced
from pySDC.implementations.sweeper_classes.imex_1st_order import imex_1st_order
from pySDC.implementations.transfer_classes.TransferMesh import mesh_to_mesh
def main():
"""
A simple test program to do PFASST runs for the heat equation
"""
# initialize level parameters
level_params = dict()
level_params['restol'] = 1e-10
level_params['dt'] = 0.25
# initialize sweeper parameters
sweeper_params = dict()
sweeper_params['quad_type'] = 'RADAU-RIGHT'
sweeper_params['num_nodes'] = [3]
sweeper_params['QI'] = 'LU' # For the IMEX sweeper, the LU-trick can be activated for the implicit part
# initialize problem parameters
problem_params = dict()
problem_params['nu'] = 0.1 # diffusion coefficient
problem_params['freq'] = 8 # frequency for the test value
problem_params['nvars'] = [511, 255] # number of degrees of freedom for each level
problem_params['bc'] = 'dirichlet-zero' # boundary conditions
# initialize step parameters
step_params = dict()
step_params['maxiter'] = 50
# initialize space transfer parameters
space_transfer_params = dict()
space_transfer_params['rorder'] = 2
space_transfer_params['iorder'] = 6
# initialize controller parameters
controller_params = dict()
controller_params['logger_level'] = 30
controller_params['predict_type'] = 'pfasst_burnin'
# fill description dictionary for easy step instantiation
description = dict()
description['problem_class'] = heatNd_forced # pass problem class
description['problem_params'] = problem_params # pass problem parameters
description['sweeper_class'] = imex_1st_order # pass sweeper (see part B)
description['sweeper_params'] = sweeper_params # pass sweeper parameters
description['level_params'] = level_params # pass level parameters
description['step_params'] = step_params # pass step parameters
description['space_transfer_class'] = mesh_to_mesh # pass spatial transfer class
description['space_transfer_params'] = space_transfer_params # pass parameters for spatial transfer
# set time parameters
t0 = 0.0
Tend = 4.0
# set up list of parallel time-steps to run PFASST with
nsteps = int(Tend / level_params['dt'])
num_proc_list = [2**i for i in range(int(np.log2(nsteps) + 1))]
Path("data").mkdir(parents=True, exist_ok=True)
f = open('data/step_5_B_out.txt', 'w')
# loop over different number of processes and check results
for num_proc in num_proc_list:
out = 'Working with %2i processes...' % num_proc
f.write(out + '\n')
print(out)
# instantiate controller
controller = controller_nonMPI(num_procs=num_proc, controller_params=controller_params, description=description)
# get initial values on finest level
P = controller.MS[0].levels[0].prob
uinit = P.u_exact(t0)
# call main function to get things done...
uend, stats = controller.run(u0=uinit, t0=t0, Tend=Tend)
# compute exact solution and compare
uex = P.u_exact(Tend)
err = abs(uex - uend)
# filter statistics by type (number of iterations)
iter_counts = get_sorted(stats, type='niter', sortby='time')
# compute and print statistics
for item in iter_counts:
out = 'Number of iterations for time %4.2f: %2i' % item
f.write(out + '\n')
print(out)
f.write('\n')
print()
niters = np.array([item[1] for item in iter_counts])
out = ' Mean number of iterations: %4.2f' % np.mean(niters)
f.write(out + '\n')
print(out)
out = ' Range of values for number of iterations: %2i ' % np.ptp(niters)
f.write(out + '\n')
print(out)
out = ' Position of max/min number of iterations: %2i -- %2i' % (
int(np.argmax(niters)),
int(np.argmin(niters)),
)
f.write(out + '\n')
print(out)
out = ' Std and var for number of iterations: %4.2f -- %4.2f' % (float(np.std(niters)), float(np.var(niters)))
f.write(out + '\n')
print(out)
f.write('\n\n')
print()
print()
assert err < 1.3505e-04, "ERROR: error is too high, got %s" % err
assert np.ptp(niters) <= 1, "ERROR: range of number of iterations is too high, got %s" % np.ptp(niters)
assert np.mean(niters) <= 5.0, "ERROR: mean number of iterations is too high, got %s" % np.mean(niters)
f.close()
if __name__ == "__main__":
main()
Results:
Working with 1 processes...
Number of iterations for time 0.00: 5
Number of iterations for time 0.25: 5
Number of iterations for time 0.50: 5
Number of iterations for time 0.75: 5
Number of iterations for time 1.00: 5
Number of iterations for time 1.25: 5
Number of iterations for time 1.50: 5
Number of iterations for time 1.75: 5
Number of iterations for time 2.00: 5
Number of iterations for time 2.25: 5
Number of iterations for time 2.50: 5
Number of iterations for time 2.75: 5
Number of iterations for time 3.00: 4
Number of iterations for time 3.25: 5
Number of iterations for time 3.50: 5
Number of iterations for time 3.75: 5
Mean number of iterations: 4.94
Range of values for number of iterations: 1
Position of max/min number of iterations: 0 -- 12
Std and var for number of iterations: 0.24 -- 0.06
Working with 2 processes...
Number of iterations for time 0.00: 5
Number of iterations for time 0.25: 5
Number of iterations for time 0.50: 5
Number of iterations for time 0.75: 5
Number of iterations for time 1.00: 5
Number of iterations for time 1.25: 5
Number of iterations for time 1.50: 5
Number of iterations for time 1.75: 5
Number of iterations for time 2.00: 5
Number of iterations for time 2.25: 5
Number of iterations for time 2.50: 5
Number of iterations for time 2.75: 5
Number of iterations for time 3.00: 4
Number of iterations for time 3.25: 4
Number of iterations for time 3.50: 5
Number of iterations for time 3.75: 5
Mean number of iterations: 4.88
Range of values for number of iterations: 1
Position of max/min number of iterations: 0 -- 12
Std and var for number of iterations: 0.33 -- 0.11
Working with 4 processes...
Number of iterations for time 0.00: 5
Number of iterations for time 0.25: 5
Number of iterations for time 0.50: 5
Number of iterations for time 0.75: 5
Number of iterations for time 1.00: 5
Number of iterations for time 1.25: 5
Number of iterations for time 1.50: 5
Number of iterations for time 1.75: 5
Number of iterations for time 2.00: 5
Number of iterations for time 2.25: 5
Number of iterations for time 2.50: 5
Number of iterations for time 2.75: 5
Number of iterations for time 3.00: 4
Number of iterations for time 3.25: 4
Number of iterations for time 3.50: 4
Number of iterations for time 3.75: 4
Mean number of iterations: 4.75
Range of values for number of iterations: 1
Position of max/min number of iterations: 0 -- 12
Std and var for number of iterations: 0.43 -- 0.19
Working with 8 processes...
Number of iterations for time 0.00: 5
Number of iterations for time 0.25: 5
Number of iterations for time 0.50: 5
Number of iterations for time 0.75: 5
Number of iterations for time 1.00: 5
Number of iterations for time 1.25: 5
Number of iterations for time 1.50: 5
Number of iterations for time 1.75: 5
Number of iterations for time 2.00: 5
Number of iterations for time 2.25: 5
Number of iterations for time 2.50: 5
Number of iterations for time 2.75: 5
Number of iterations for time 3.00: 5
Number of iterations for time 3.25: 5
Number of iterations for time 3.50: 5
Number of iterations for time 3.75: 5
Mean number of iterations: 5.00
Range of values for number of iterations: 0
Position of max/min number of iterations: 0 -- 0
Std and var for number of iterations: 0.00 -- 0.00
Working with 16 processes...
Number of iterations for time 0.00: 5
Number of iterations for time 0.25: 5
Number of iterations for time 0.50: 5
Number of iterations for time 0.75: 5
Number of iterations for time 1.00: 5
Number of iterations for time 1.25: 5
Number of iterations for time 1.50: 5
Number of iterations for time 1.75: 5
Number of iterations for time 2.00: 5
Number of iterations for time 2.25: 5
Number of iterations for time 2.50: 5
Number of iterations for time 2.75: 5
Number of iterations for time 3.00: 5
Number of iterations for time 3.25: 5
Number of iterations for time 3.50: 5
Number of iterations for time 3.75: 5
Mean number of iterations: 5.00
Range of values for number of iterations: 0
Position of max/min number of iterations: 0 -- 0
Std and var for number of iterations: 0.00 -- 0.00
Part C: Advection and PFASST
We saw in the last part that PFASST performs very well for certain parabolic problems.
Now we test PFASST for an advection test case to see how it fares there.
The basic setup is the same, but this time we use only an implicit sweeper and periodic boundary conditions.
To make things more interesting, we choose two different preconditioners: the LU-trick as well as the implicit Euler method, and check how these perform for this kind of problem.
We see that, in contrast to the parabolic problem, the iteration counts increase significantly when more parallel time-steps are computed.
Again, this depends heavily on the actual problem under consideration, but it is typical behavior for parallel-in-time algorithms of this type.
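Compared to Part B, only a few entries of the problem and description dictionaries change; the following sketch highlights them (all values are taken from the full code below, which also sets the remaining parameters):

from pySDC.implementations.problem_classes.AdvectionEquation_ND_FD import advectionNd
from pySDC.implementations.sweeper_classes.generic_implicit import generic_implicit

# sketch: fully implicit advection setup instead of the IMEX heat setup of Part B
problem_params = dict()
problem_params['c'] = 1  # advection speed
problem_params['freq'] = 4  # frequency for the test value
problem_params['nvars'] = [128, 64]  # degrees of freedom per level
problem_params['order'] = 4  # order of the finite difference discretization
problem_params['bc'] = 'periodic'  # periodic boundary conditions
problem_params['stencil_type'] = 'center'  # centered finite difference stencil

description = dict()
description['problem_class'] = advectionNd
description['problem_params'] = problem_params
description['sweeper_class'] = generic_implicit  # implicit-only sweeper instead of IMEX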
Important things to note:
- The setup is actually periodic in time as well! At Tend = 1 the exact solution looks exactly like the initial condition (see the short check after this list).
- Like the standard IMEX sweeper, the generic_implicit sweeper allows the user to change the preconditioner, named QI. To get the standard implicit Euler scheme, choose IE, while for the LU-trick, choose LU. More choices have been implemented in pySDC.plugins.sweeper_helper.get_Qd.
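A quick way to convince oneself of the periodicity in time is to compare the exact solution at t0 and Tend. This is a sketch under the assumption that a controller has been set up as in the full code below; abs() returns the norm of the difference, as in the error check of the full code:

# sketch: with c=1, freq=4 and periodic boundary conditions, the exact solution has period 1 in time
P = controller.MS[0].levels[0].prob
print(abs(P.u_exact(0.0) - P.u_exact(1.0)))  # expected to be (numerically) zero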
Full code: pySDC/tutorial/step_5/C_advection_and_PFASST.py
from pathlib import Path
import numpy as np
from pySDC.helpers.stats_helper import get_sorted
from pySDC.implementations.controller_classes.controller_nonMPI import controller_nonMPI
from pySDC.implementations.problem_classes.AdvectionEquation_ND_FD import advectionNd
from pySDC.implementations.sweeper_classes.generic_implicit import generic_implicit
from pySDC.implementations.transfer_classes.TransferMesh import mesh_to_mesh
def main():
"""
A simple test program to run PFASST for the advection equation in multiple ways...
"""
# initialize level parameters
level_params = dict()
level_params['restol'] = 1e-09
level_params['dt'] = 0.0625
# initialize sweeper parameters
sweeper_params = dict()
sweeper_params['quad_type'] = 'RADAU-RIGHT'
sweeper_params['num_nodes'] = [3]
# initialize problem parameters
problem_params = dict()
problem_params['c'] = 1 # advection coefficient
problem_params['freq'] = 4 # frequency for the test value
problem_params['nvars'] = [128, 64] # number of degrees of freedom for each level
problem_params['order'] = 4
problem_params['bc'] = 'periodic'
problem_params['stencil_type'] = 'center'
# initialize step parameters
step_params = dict()
step_params['maxiter'] = 50
# initialize space transfer parameters
space_transfer_params = dict()
space_transfer_params['rorder'] = 2
space_transfer_params['iorder'] = 6
space_transfer_params['periodic'] = True
# initialize controller parameters
controller_params = dict()
controller_params['logger_level'] = 30
controller_params['predict_type'] = 'pfasst_burnin'
# fill description dictionary for easy step instantiation
description = dict()
description['problem_class'] = advectionNd # pass problem class
description['problem_params'] = problem_params # pass problem parameters
description['sweeper_class'] = generic_implicit # pass sweeper (see part B)
description['level_params'] = level_params # pass level parameters
description['step_params'] = step_params # pass step parameters
description['space_transfer_class'] = mesh_to_mesh # pass spatial transfer class
description['space_transfer_params'] = space_transfer_params # pass parameters for spatial transfer
# set time parameters
t0 = 0.0
Tend = 1.0
# set up list of parallel time-steps to run PFASST with
nsteps = int(Tend / level_params['dt'])
num_proc_list = [2**i for i in range(int(np.log2(nsteps) + 1))]
# set up list of types of implicit SDC sweepers: LU and implicit Euler here
QI_list = ['LU', 'IE']
niters_min_all = {}
niters_max_all = {}
Path("data").mkdir(parents=True, exist_ok=True)
f = open('data/step_5_C_out.txt', 'w')
# loop over different types of implicit sweeper types
for QI in QI_list:
# define and set preconditioner for the implicit sweeper
sweeper_params['QI'] = QI
description['sweeper_params'] = sweeper_params # pass sweeper parameters
# init min/max iteration counts
niters_min_all[QI] = 99
niters_max_all[QI] = 0
# loop over different number of processes
for num_proc in num_proc_list:
out = 'Working with QI = %s on %2i processes...' % (QI, num_proc)
f.write(out + '\n')
print(out)
# instantiate controller
controller = controller_nonMPI(
num_procs=num_proc, controller_params=controller_params, description=description
)
# get initial values on finest level
P = controller.MS[0].levels[0].prob
uinit = P.u_exact(t0)
# call main function to get things done...
uend, stats = controller.run(u0=uinit, t0=t0, Tend=Tend)
# compute exact solution and compare
uex = P.u_exact(Tend)
err = abs(uex - uend)
# filter statistics by type (number of iterations)
iter_counts = get_sorted(stats, type='niter', sortby='time')
# compute and print statistics
niters = np.array([item[1] for item in iter_counts])
niters_min_all[QI] = min(np.mean(niters), niters_min_all[QI])
niters_max_all[QI] = max(np.mean(niters), niters_max_all[QI])
out = ' Mean number of iterations: %4.2f' % np.mean(niters)
f.write(out + '\n')
print(out)
out = ' Range of values for number of iterations: %2i ' % np.ptp(niters)
f.write(out + '\n')
print(out)
out = ' Position of max/min number of iterations: %2i -- %2i' % (
int(np.argmax(niters)),
int(np.argmin(niters)),
)
f.write(out + '\n')
print(out)
out = ' Std and var for number of iterations: %4.2f -- %4.2f' % (
float(np.std(niters)),
float(np.var(niters)),
)
f.write(out + '\n')
print(out)
f.write('\n')
print()
assert err < 5.1365e-04, "ERROR: error is too high, got %s" % err
out = 'Mean number of iterations went up from %4.2f to %4.2f for QI = %s!' % (
niters_min_all[QI],
niters_max_all[QI],
QI,
)
f.write(out + '\n')
print(out)
f.write('\n\n')
print()
print()
f.close()
if __name__ == "__main__":
main()
Results:
Working with QI = LU on 1 processes...
Mean number of iterations: 5.00
Range of values for number of iterations: 0
Position of max/min number of iterations: 0 -- 0
Std and var for number of iterations: 0.00 -- 0.00
Working with QI = LU on 2 processes...
Mean number of iterations: 5.50
Range of values for number of iterations: 1
Position of max/min number of iterations: 1 -- 0
Std and var for number of iterations: 0.50 -- 0.25
Working with QI = LU on 4 processes...
Mean number of iterations: 6.50
Range of values for number of iterations: 3
Position of max/min number of iterations: 3 -- 0
Std and var for number of iterations: 1.12 -- 1.25
Working with QI = LU on 8 processes...
Mean number of iterations: 8.94
Range of values for number of iterations: 9
Position of max/min number of iterations: 7 -- 0
Std and var for number of iterations: 2.82 -- 7.93
Working with QI = LU on 16 processes...
Mean number of iterations: 14.75
Range of values for number of iterations: 21
Position of max/min number of iterations: 15 -- 0
Std and var for number of iterations: 6.59 -- 43.44
Mean number of iterations went up from 5.00 to 14.75 for QI = LU!
Working with QI = IE on 1 processes...
Mean number of iterations: 5.00
Range of values for number of iterations: 0
Position of max/min number of iterations: 0 -- 0
Std and var for number of iterations: 0.00 -- 0.00
Working with QI = IE on 2 processes...
Mean number of iterations: 5.50
Range of values for number of iterations: 1
Position of max/min number of iterations: 1 -- 0
Std and var for number of iterations: 0.50 -- 0.25
Working with QI = IE on 4 processes...
Mean number of iterations: 6.50
Range of values for number of iterations: 3
Position of max/min number of iterations: 3 -- 0
Std and var for number of iterations: 1.12 -- 1.25
Working with QI = IE on 8 processes...
Mean number of iterations: 8.50
Range of values for number of iterations: 7
Position of max/min number of iterations: 7 -- 0
Std and var for number of iterations: 2.29 -- 5.25
Working with QI = IE on 16 processes...
Mean number of iterations: 13.12
Range of values for number of iterations: 17
Position of max/min number of iterations: 15 -- 0
Std and var for number of iterations: 5.24 -- 27.48
Mean number of iterations went up from 5.00 to 13.12 for QI = IE!