My GSoC 2017 project was to implement a Julia package for solving ordinary differential equations (ODEs) using neural networks. The purpose of the project was to provide an additional DE solver based on neural networks, whose key advantage over iterative solvers is parallelism in time. The project was based on the work of Lagaris et al. (1997), which proposed using the function-approximation capabilities of neural networks (NNs) to solve differential equations. The project was a mixture of research and implementation, and it still has a few parts left to work on. I chose this project because I am interested in both mathematics and machine learning, and it involves concepts from both fields. The package uses DifferentialEquations.jl for the solver interface and Knet.jl for the NN solver implementation.
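The core idea from that paper, written roughly in my own notation, is to build a trial solution that satisfies the initial condition by construction and to train the network so that the trial solution satisfies the ODE on a grid of time points:

$$ u_{\text{trial}}(t) = u_0 + (t - t_0)\, N(t; \theta), \qquad \min_{\theta} \sum_i \left( \frac{d u_{\text{trial}}}{dt}(t_i) - f\big(t_i, u_{\text{trial}}(t_i)\big) \right)^2 $$

where $N(t; \theta)$ is the neural network and $f$ is the right-hand side of the ODE $u'(t) = f(t, u(t))$ with $u(t_0) = u_0$.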
After implementing the solver for ODEs and systems of ODEs, the difficult part was making the NN converge for systems of ODEs over longer time domains. Since a neural network involves many factors, such as the hidden layer width (number of hidden neurons), the activation functions, and the weight initialization, I relied on my machine-learning background, as well as help from my mentors, to experiment with most of the feasible hyper-parameter settings and to record the convergence accuracy and performance of the solver. Making the NNs converge for systems of ODEs is not as easy as it seems, and it took up most of the time for experimentation and tuning. Predicting the solution of a system of DEs over a larger domain is still a challenge that needs to be worked on.
For the parallel-in-time implementation we rely on Knet.jl's array machinery: the solver runs on ordinary CPU arrays by default, and GPU execution is also supported via KnetArray (Knet's GPU array type), which requires a CUDA-capable GPU with the cuDNN library installed.
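As a rough illustration of how Knet code typically chooses between CPU and GPU arrays (this snippet is my own sketch, not part of NeuralNetDiffEq.jl, and the names atype and w are made up):

using Knet
# gpu() returns the id of the active GPU, or -1 when no usable GPU is found.
# KnetArray stores its data on the GPU; a plain Array keeps it on the CPU.
atype = gpu() >= 0 ? KnetArray : Array
# Hypothetical 10x1 weight matrix; it lives on the GPU when one is available.
w = atype(randn(Float32, 10, 1))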
using NeuralNetDiffEq
using Plots; plotly()
using DiffEqBase, ParameterizedFunctions
using DiffEqProblemLibrary, DiffEqDevTools
using Knet
# ODE right-hand side in the f(t,u) form: du/dt = 1.01u
linear = (t,u) -> (1.01*u)
# Known analytic solution, registered so that plot_analytic can use it
(f::typeof(linear))(::Type{Val{:analytic}},t,u0) = u0*exp(1.01*t)
prob = ODEProblem(linear,1/2,(0.0,1.0))
# nnode(10) selects the neural-network solver; iterations is the number of training iterations
sol = solve(prob,nnode(10),dt=1/10,iterations=10)
plot(sol,plot_analytic=true)
sol(0.232)
1-element Array{Any,1}:
0.625818
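As a quick sanity check (my addition), the prediction above can be compared with the analytic solution u(t) = u0*exp(1.01t) at the same point:

# analytic value at t = 0.232 for u0 = 1/2; the NN prediction of 0.625818
# above is within roughly 1% of it
0.5 * exp(1.01 * 0.232)   # ≈ 0.632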
f = (t,u) -> (t^3 + 2*t + (t^2)*((1+3*(t^2))/(1+t+(t^3))) - u*(t + ((1+3*(t^2))/(1+t+t^3))))
(::typeof(f))(::Type{Val{:analytic}},t,u0) = u0*exp(-(t^2)/2)/(1+t+t^3) + t^2
prob2 = ODEProblem(f,1.0,(0.0,1.0))
sol2 = solve(prob2,nnode(10),dt=0.1,iterations=200)
plot(sol2,plot_analytic=true)
sol2(0.47)
1-element Array{Any,1}:
0.803109
f2 = (t,u) -> (-u/5 + exp(-t/5).*cos(t))
(::typeof(f2))(::Type{Val{:analytic}},t,u0) = exp(-t/5)*(u0 + sin(t))
prob3 = ODEProblem(f2,Float32(0.0),(Float32(0.0),Float32(2.0)))
sol3 = solve(prob3,nnode(10),dt=0.2,iterations=1000)
plot(sol3,plot_analytic=true)
sol3([0.721])
1-element Array{Any,1}:
Any[0.574705]
# System of independent linear ODEs: du[i]/dt = 1.01*u[i]
f_2dlinear = (t,u) -> begin
    du = Array{Any}(length(u))
    for i in 1:length(u)
        du[i] = 1.01*u[i]
    end
    return du
end
(f::typeof(f_2dlinear))(::Type{Val{:analytic}},t,u0) = u0*exp.(1.01*t)
prob_ode_2Dlinear = ODEProblem(f_2dlinear,rand(4,1),(0.0,1.0))
@time sol1 = solve(prob_ode_2Dlinear,nnode([10,50]),dt=0.1,iterations=100)
(:iteration,100,:loss,0.004670103680503722)
16.494870 seconds (90.08 M allocations: 6.204 GB, 5.82% gc time)
plot(sol1,plot_analytic=true)
# Lotka-Volterra predator-prey system
function lotka_volterra(t,u)
    du1 = 1.5 .* u[1] - 1.0 .* u[1].*u[2]
    du2 = -3 .* u[2] + u[1].*u[2]
    [du1,du2]
end
lotka_volterra (generic function with 1 method)
prob_ode_lotkavoltera = ODEProblem(lotka_volterra,Float32[1.0,1.0],(Float32(0.0),Float32(1.0)))
@time sol2 = solve(prob_ode_lotkavoltera,nnode([10,50]),dt=0.2,iterations=1000)
(:iteration,100,:loss,0.020173132003438572)
(:iteration,200,:loss,0.005130137452114811)
(:iteration,300,:loss,0.004812458584875589)
(:iteration,400,:loss,0.010083624565714974)
(:iteration,500,:loss,0.0025328170079611887)
(:iteration,600,:loss,0.007685579218433846)
(:iteration,700,:loss,0.005065291031504465)
(:iteration,800,:loss,0.005326863832044214)
(:iteration,900,:loss,0.00030436474139241827)
(:iteration,1000,:loss,0.0034853904995959094)
22.126081 seconds (99.65 M allocations: 5.923 GB, 5.21% gc time)
plot(sol2)
To show that the solver, with the current settings and hyper-parameters, does not work on a larger time domain, here is the Lotka-Volterra problem again, now solved on the interval (0, 5):
prob_ode_lotkavoltera = ODEProblem(lotka_volterra,Float32[1.0,1.0],(Float32(0.0),Float32(5.0)))
sol3 = solve(prob_ode_lotkavoltera,nnode([10,50]),dt=0.2,iterations=1000)
plot(sol3)
However, the true solution should be oscillatory, which indicates that the NN did not properly converge. To see more examples and experimental results, you can check out my Jupyter notebooks here.
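For reference, the expected oscillatory behaviour can be reproduced with a classical adaptive solver. The sketch below is my addition and assumes OrdinaryDiffEq.jl is installed (it is not used elsewhere in this post):

using OrdinaryDiffEq
# Solve the same (0, 5) Lotka-Volterra problem with Tsit5 as a reference;
# unlike the NN solution above, this one oscillates.
sol_ref = solve(prob_ode_lotkavoltera, Tsit5())
plot(sol_ref)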
I would really like to thank my GSoC mentors, Chris Rackauckas and Lyndon White, for all the help they provided in understanding both the mathematical and the coding parts of the project. I would also like to thank the Julia community for giving me the opportunity to contribute and for sponsoring my trip to JuliaCon 2017, which was awesome.