01 JULIA1 - 6A: Metaprogramming and macros (23:46)
01 JULIA1 - 6B: Interoperability with other languages (23:06)
01 JULIA1 - 6C: Performances and errors: profiling, debugging, introspection and exceptions (27:33)
01 JULIA1 - 6D: Parallel computation: multithreading, multiprocessing (20:03)
0106 Further Topics
Some stuff to set up the environment...
julia> cd(@__DIR__)

julia> using Pkg

julia> Pkg.activate(".")
# If using a Julia version different than 1.10 please uncomment and run the following lines (reproducibility guarantee will however be lost)
# Pkg.resolve()
# Pkg.instantiate() # run this if you didn't in Segment 01.01
  Activating project at `~/work/SPMLJ/SPMLJ/buildedDoc/01_-_JULIA1_-_Basic_Julia_programming`

julia> using Random

julia> Random.seed!(123)
Random.TaskLocalRNG()

julia> using InteractiveUtils # loaded automatically when working... interactively
Metaprogramming and macros
"Running" some code involves the following passages (roughly):
- parsing of the text defining the code and its translation into the hierarchical expressions of the Abstract Syntax Tree (AST) (syntax errors are caught at this time)
- on the first instance required ("just in time"), compilation of the AST expressions into object code (using the LLVM compiler)
- execution of the compiled object code
"Macros" in many other languages (e.g. C or C++) refer to the possibility to "pre-process" the textual representation of the code statements before they are parsed. In Julia, instead, the term refers to the possibility to alter the expressions once they have already been parsed into the AST, allowing a greater expressivity, as we are no longer limited by the parsing syntax.
The AST is organised in a hierarchical tree of expressions where each element (including the operators) is a symbol. For variables, you can use symbols to refer to the actual identifier instead of the variable's value.
Expressions themselves are objects representing unevaluated computer expressions
Expressions and symbols
julia> expr1 = Meta.parse("a = b + 2") # What the parser does when reading the source code. b doesn't need to actually be defined, it's just a name binding without a reference to any object, not even `nothing`
:(a = b + 2)

julia> typeof(expr1) # expressions are first class objects
Expr

julia> expr2 = :(a = b + 1)
:(a = b + 1)

julia> expr3 = quote a = b + 1 end
quote
    #= REPL[4]:1 =#
    a = b + 1
end

julia> expr3
quote
    #= REPL[4]:1 =#
    a = b + 1
end

julia> dump(expr1) # The AST! Note this is already a nested statement: an assignment of the result of an expression (the sum call between the symbol `:b` and 2) to the symbol `a`
Expr
  head: Symbol =
  args: Array{Any}((2,))
    1: Symbol a
    2: Expr
      head: Symbol call
      args: Array{Any}((3,))
        1: Symbol +
        2: Symbol b
        3: Int64 2

julia> expr4 = Expr(:(=),:a,Expr(:call,:+,:b,1)) # The AST using the "Expr" constructor
:(a = b + 1)

julia> symbol1 = :(a) # as for expressions
:a

julia> symbol2 = Meta.parse("a") # as for expressions
:a

julia> symbol3 = Symbol("a") # specific for symbols only
:a

julia> othsymbol = Symbol("aaa",10,"bbb")
:aaa10bbb

julia> typeof(symbol1)
Symbol
I can access any part of my expression before evaluating it (indeed, that's what macros do...):
julia> myASymbol = expr1.args[2].args[1]
:+

julia> expr1.args[2].args[1] = :(*)
:*

julia> b = 2 # a # error, a not defined
2

julia> eval(expr1)
4

julia> a # here now it is defined and it has an object associated... 4!
4
The capability to evaluate expressions is very powerful but, due to obvious security implications, never evaluate expressions whose provenance you are not sure of. For example, if you develop a Julia web app (e.g. using Genie.jl), never evaluate user-provided expressions.
Note that evaluation of expressions always happens at global scope, even when it is done inside a function:
julia> function foo()
           locVar = 1
           expr = :(locVar + 1)
           return eval(expr)
       end
foo (generic function with 1 method)

julia> # a = foo() # error, locVar not defined
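One workaround (a minimal sketch; the function name foo2 is just illustrative) is to interpolate the value of the local variable into the expression at construction time, using the dollar-sign syntax introduced just below:

```julia
function foo2()
    locVar = 1
    # Interpolating with `$` embeds the *value* of locVar (here 1) in the
    # expression itself, so eval at global scope no longer needs to resolve
    # the local variable's name
    expr = :($locVar + 1)
    return eval(expr)
end

foo2() # 2
```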
To refer to the value of a variable rather than the identifier itself within an expression, interpolate the variable using the dollar sign:
julia> expr = :($a + b) # here the identifier 'a' has been replaced with its numerical value, `4`
:(4 + b)

julia> dump(expr)
Expr
  head: Symbol call
  args: Array{Any}((3,))
    1: Symbol +
    2: Int64 4
    3: Symbol b

julia> eval(expr)
6

julia> a = 10
10

julia> eval(expr) # no changes
6

julia> b = 100
100

julia> eval(expr) # here it changes, as it is at eval time that the identifier `b` is "replaced" with its value
104
Macros
One of the best uses of macros is that they allow package developers to provide a very flexible API to their packages, suited to their specific needs, making life easier for the users. Compare for example the API of JuMP with that of Pyomo for defining model constraints!
Some examples of macros:

MultiDimEquations.jl:
- from: @meq par1[d1 in DIM1, d2 in DIM2, dfix3] = par2[d1,d2]+par3[d1,d2]
- to: [par1[d1,d2,dfix3] = par2[d1,d2]+par3[d1,d2] for d1 in DIM1, d2 in DIM2]

Pipe.jl:
- from: @pipe 10 |> foo(_,a) |> foo2(b,_,c) |> foo3(_)
- to: foo3(foo2(b,foo(10,a),c))

Broadcasting (Base):
- from: @. a + b * D^2
- to: a .+ b .* D.^2
Defining a macro...
Macros are like functions, but both their arguments and their returned output are expressions.
julia> macro customLoop(controlExpr,workExpr)
         return quote
           for i in $controlExpr
             $workExpr
           end
         end
       end
@customLoop (macro with 1 method)
Invoking a macro....
julia> a = 5
5

julia> @customLoop 1:4 println(i) # note that "i" is in the macro
1
2
3
4

julia> @customLoop 1:a println(i)
1
2
3
4
5

julia> @customLoop 1:a if i > 3 println(i) end
4
5

julia> @customLoop ["apple", "orange", "banana"] println(i)
apple
orange
banana

julia> @customLoop ["apple", "orange", "banana"] begin print("i: "); println(i) end
i: apple
i: orange
i: banana

julia> @macroexpand @customLoop 1:4 println(i) # print what the macro does with the specific expressions provided
quote
    #= REPL[1]:3 =#
    for var"#431#i" = 1:4
        #= REPL[1]:4 =#
        Main.var"Main".println(var"#431#i")
        #= REPL[1]:5 =#
    end
end
String macros (aka "non-standard string literals") are invoked with the syntax xxx" ...text..." or xxx""" ...multi-line text...""", where xxx is the name of the macro, and the macro must be defined as macro xxx_str. They are used to perform textual modifications on the given text; for example, this prints the given text in columns of 8 characters:
julia> macro print8_str(mystr) # input here is a string, not an expression
         limits = collect(1:8:length(mystr))
         for (i,j) in enumerate(limits)
           st = j
           en = i==length(limits) ? length(mystr) : j+7
           println(mystr[st:en])
         end
       end
@print8_str (macro with 1 method)

julia> print8"123456789012345678"
12345678
90123456
78

julia> print8"""This is a text that once printed in 8 columns with terminal will be several lines. Ok, no rammar rules relating to carriage returns are emploied here..."""
This is
a text t
hat once
 printed
 in 8 co
lumns wi
th termi
nal will
 be seve
ral line
s. Ok, n
o rammar
 rules r
elating
to carri
age retu
rns are
emploied
 here...
While normally used to modify text, string macros are "true" macros:
julia> macro customLoop_str(str)
         exprs = Meta.parse(str)
         controlExpr,workExpr = exprs.args[1],exprs.args[2]
         return quote
           for i in $controlExpr
             $workExpr
           end
         end
       end
@customLoop_str (macro with 1 method)

julia> customLoop"""1:4; println(i)"""
1
2
3
4
Interfacing with other languages
There are three ways to interface Julia with programs or libraries written in other languages. At the lowest level, Julia allows you to directly interface with C or Fortran libraries; besides directly using libraries written in C, this means being able to interface with any programming language that also offers a C interface (R, Python...). Using this low-level C interface, users have created specific packages to interface many languages using a simple, Julian-way syntax. We will see these interfaces for R and Python. Finally, at the highest level, many common packages of other languages have already been "interfaced", so that the user can use the Julia package without even knowing that it is an interface to another package; for example, SymPy.jl is a large interface to the Python package SymPy.
Using C libraries
Let's start by seeing how to use a C library. For this example to work you will need to have the GCC compiler installed on your machine. First, let's write the header and source C files and save them to disk:
julia> cheader = """
       extern int get5();
       extern double mySum(float x, float y);
       """
"extern int get5();\nextern double mySum(float x, float y);\n"

julia> csource = """
       int get5(){
        return 5;
       }

       double mySum(float x, float y){
        return x+y;
       }
       """
"int get5(){\n return 5;\n}\n\ndouble mySum(float x, float y){\n return x+y;\n}\n"

julia> open(f->write(f,cheader),"myclib.h","w") # We open a stream to the file with the "w" ("write") parameter, and we pass the stream to the anonymous function that actually writes to it. If this function were many lines of code, consider rewriting the `open` statement using a `do` block
58

julia> open(f->write(f,csource),"myclib.c","w")
79
Now let's run the commands to compile the C code we saved into a shared library using gcc, a C compiler. The following example assumes that GCC is installed on the machine where this example is run and is available as gcc.
julia> compilationCommand1 = `gcc -o myclib.o -c myclib.c` # the actual compilation, note the backticks used to define a command
`gcc -o myclib.o -c myclib.c`

julia> compilationCommand2 = `gcc -shared -o libmyclib.so myclib.o -lm -fPIC` # the linking into a shared library
`gcc -shared -o libmyclib.so myclib.o -lm -fPIC`

julia> run(compilationCommand1)
Process(`gcc -o myclib.o -c myclib.c`, ProcessExited(0))

julia> run(compilationCommand2)
Process(`gcc -shared -o libmyclib.so myclib.o -lm -fPIC`, ProcessExited(0))
This should have created the C library libmyclib.so on disk. Let's use it:
julia> const myclib = joinpath(@__DIR__, "libmyclib.so") # we need the full path
"/home/runner/work/SPMLJ/SPMLJ/buildedDoc/01_-_JULIA1_-_Basic_Julia_programming/libmyclib.so"
ccall arguments:
- A tuple with the function name to call and the library path. For both, if embedded in a variable, the variable must be declared constant.
- The Julia type that maps to the C type returned by the function:
  int → Int32 or Int64 (or the easy-to-remember Cint alias)
  float → Float32 (or the Cfloat alias)
  double → Float64 (or the Cdouble alias)
- A tuple with the Julia types of the parameters passed to the C function
- Any further arguments are the values of the parameters passed
julia> a = ccall((:get5,myclib), Int32, ())
5

julia> b = ccall((:mySum,myclib), Float64, (Float32,Float32), 2.5, 1.5)
4.0
More details on calling C or Fortran code can be obtained in the official Julia documentation.
Using Python in Julia
The "default" way to use Python code in Julia is through the PyCall.jl package. It automatically takes care of converting between Python types (including NumPy arrays) and Julia types (types that cannot be converted automatically are converted to the generic PyObject type).
julia> ENV["PYTHON"] = "" # will force PyCall to download and use a "private to Julia" (conda based) version of Python. Use "/path/to/python" if you want to reuse a version already installed on your system
# using Pkg
# Pkg.add("PyCall")
# Pkg.build("PyCall")
""

julia> using PyCall
Embed short python snippets in Julia
julia> py"""
       def sumMyArgs (i, j):
         return i+j
       def getNthElement (vec,n):
         return vec[n]
       """

julia> a = py"sumMyArgs"(3,4) # 7 - here we call the Python object (a function) with Julia parameters
7

julia> b = py"getNthElement"([1,2,3],1) # 2 - attention to the different convention for starting arrays! Note the Julia Array has been converted automatically to a Python list
2

julia> d = py"getNthElement([1,$a,3],1)" # 7 - here we interpolate the Python call
7
Alternatively, use @pyinclude("pythonScript.py"):
julia> pythonCode = """
       def sumMyArgs (i, j, z):
        return i+j+z
       """
"def sumMyArgs (i, j, z):\n return i+j+z\n"

julia> open(f->write(f,pythonCode),"pythonScript.py","w")
40

julia> @pyinclude("pythonScript.py")

julia> a = py"sumMyArgs"(3,4,5)
12
Note that the 3-argument definition of sumMyArgs has replaced the 2-argument one. This would now error: py"sumMyArgs"(3,4)
Use Python libraries
Add a package to the local Python installation using Conda:
julia> pyimport_conda("ezodf", "ezodf", "conda-forge") # pyimport_conda(module, package, channel)
PyObject <module 'ezodf' from '/home/runner/.local/lib/python3.10/site-packages/ezodf/__init__.py'>

julia> const ez = pyimport("ezodf") # Equiv. of Python `import ezodf as ez`
PyObject <module 'ezodf' from '/home/runner/.local/lib/python3.10/site-packages/ezodf/__init__.py'>

julia> destDoc = ez.newdoc(doctype="ods", filename="anOdsSheet.ods")
PyObject <ezodf.document.PackagedDocument object at 0x7f7683b58b20>
Both ez and destDoc are PyObjects, for which we can access attributes and call methods using the usual obj.method() syntax, as we would do in Python:
julia> sheet = ez.Sheet("Sheet1", size=(10, 10))
PyObject <ezodf.table.Table object at 0x7f7683b582b0>

julia> destDoc.sheets.append(sheet)
# dcell1 = sheet[(2,3)] # This would error because the index is a tuple. Let's use directly the `get(obj,key)` function instead:
PyObject <ezodf.table.Table object at 0x7f7683b582b0>

julia> dcell1 = get(sheet,(2,3)) # Equiv. of Python `dcell1 = sheet[(2,3)]`. Attention again to Python indexing from zero: this is cell "D3", not "B3"!
PyObject <ezodf.cells.Cell object at 0x7f7683b58220>

julia> dcell1.set_value("Hello")

julia> get(sheet,"A9").set_value(10.5) # Equiv. of Python `sheet['A9'].set_value(10.5)`

julia> destDoc.backup = false
false

julia> destDoc.save()
Using Julia in Python
Installation of the Python package PyJulia
PyJulia can be installed using pip, taking note that its name on pip is julia, not PyJulia:

$ python3 -m pip install --user julia
We can now open a Python terminal and initialise PyJulia to work with our Julia version:
>>> import julia
>>> julia.install() # Only once to set-up in julia the julia packages required by PyJulia

If we have multiple Julia versions, we can specify the one to use in Python passing julia="/path/to/julia/binary/executable" (e.g. julia = "/home/myUser/lib/julia-1.1.0/bin/julia") to the install() function.
Running Julia libraries and code in Python
On each Python session we need to run the following code:
from julia import Julia
Julia(compiled_modules=False)

This is a workaround for the common situation when the Python interpreter is statically linked to libpython, but it will slow down the interactive experience, as it disables Julia packages pre-compilation: every time we use a module for the first time, it will need to be compiled first. Other, more efficient but also more complicated, workarounds are given in the package documentation, under the Troubleshooting section.
We can now directly load a Julia module, including Main, the global namespace of Julia's interpreter, with from julia import ModuleToLoad, and access the module objects directly or using the Module.eval() interface.
Add a Julia package...
>>> from julia import Pkg
>>> Pkg.add("BetaML")

Of course, we can alternatively add the package from within Julia.
"Direct" calling of Julia functions...
>>> from julia import BetaML
>>> import numpy as np
>>> model = BetaML.buildForest([[1,10],[2,12],[12,1]],["a","a","b"])
>>> predictions = BetaML.predict(model,np.array([[2,9],[13,0]]))
>>> predictions
[{'b': 0.36666666666666664, 'a': 0.6333333333333333}, {'b': 0.7333333333333333, 'a': 0.26666666666666666}]

Access using the eval() interface...
If we are using the eval() interface, the objects we use must already be known to Julia. To pass objects from Python to Julia, we can import the Julia Main module (the root module in Julia) and assign the needed variables, e.g.
>>> X_python = [1,2,3,2,4]
>>> from julia import Main
>>> Main.X_julia = X_python
>>> Main.eval('BetaML.gini(X_julia)')
0.7199999999999999
>>> Main.eval("""
... function makeProd(x,y)
... return x*y
... end
... """
... )
>>> Main.eval("makeProd(2,3)") # or Main.makeProd(2,3)

For large scripts, instead of using eval() we can equivalently use Main.include("aJuliaScript.jl").
Using R in Julia
To use R from within Julia we use the RCall package.
julia> ENV["R_HOME"] = "*" # will force RCall to download and use a "private to Julia" (conda based) version of R. Use "/path/to/R/directory" (e.g. `/usr/lib/R`) if you want to reuse a version already installed on your system
# using Pkg
# Pkg.add("RCall")
# Pkg.build("RCall")
"*"

julia> using RCall

julia> R"""
       sumMyArgs <- function(i,j) i+j
       getNthElement <- function(vec,n) {
         return(vec[n])
       }
       """
RCall.RObject{RCall.ClosSxp}
function (vec, n)
{
    return(vec[n])
}

julia> a = rcopy(R"sumMyArgs"(3,4)) # 7 - here we call the R object (a function) with Julia parameters
7

julia> b = rcopy(R"getNthElement"([1,2,3],1)) # 1 - no differences in array indexing here
1

julia> d = rcopy(R"as.integer(getNthElement(c(1,$a,3),2))") # 7 - here we interpolate the R call
7

julia> d = convert(Int64,R"getNthElement(c(1,$a,3),2)")
7
While here we don't have the problem of different array indexing conventions (both Julia and R start indexing arrays at 1), we have the "problem" that the output returned by R"..." is not yet an exploitable Julia object: it remains an RObject that we can convert with rcopy() or explicitly with convert(T,obj). Also, R elements are all floats by default, so if we need an integer in Julia we need to explicitly convert it, either in R or in Julia.
If the R code is in a script, we don't have here a sort of @Rinclude macro, so let's implement it ourselves by loading the file content as a string and evaluating it using the function reval provided by RCall:
julia> macro Rinclude(fname)
         quote
           rCodeString = read($fname,String)
           reval(rCodeString)
           nothing
         end
       end
@Rinclude (macro with 1 method)

julia> RCode = """
       sumMyArgs <- function(i, j, z) i+j+z
       """
"sumMyArgs <- function(i, j, z) i+j+z\n"

julia> open(f->write(f,RCode),"RScript.R","w")
37

julia> @Rinclude("RScript.R")

julia> a = rcopy(R"sumMyArgs"(3,4,5)) # 12
12

julia> # a = rcopy(R"sumMyArgs"(3,4)) # error! The 3-argument version of `sumMyArgs` has _replaced_ the 2-argument one
Using Julia in R
Installation of the R package JuliaCall
JuliaCall can be installed from CRAN:
> install.packages("JuliaCall")
> library(JuliaCall)
> install_julia()

install_julia() will download and install a private copy of Julia. If you prefer to use an existing version of Julia instead, having R default to downloading a private version only if it can't find one already installed, use julia_setup(installJulia = TRUE) instead of install_julia(), optionally passing the JULIA_HOME = "/path/to/julia/binary/executable/directory" (e.g. JULIA_HOME = "/home/myUser/lib/julia-1.7.0/bin") parameter to the julia_setup call.
JuliaCall depends for some things (like object conversion between Julia and R) on the Julia RCall package. If we don't already have RCall installed in Julia, JuliaCall will try to install it automatically.
Running Julia libraries and code in R
On each R session we need to run the julia_setup function:
library(JuliaCall)
julia_setup() # If we have already downloaded a private version of Julia for R it will be retrieved automatically

We can now load a Julia module and access the module objects directly or using the julia_eval() interface.
Add a Julia package...
> julia_eval('using Pkg; Pkg.add("BetaML")')

Of course, we can alternatively add the package from within Julia.
Let's load some data from R and do some work with this data in Julia:
> library(datasets)
> X <- as.matrix(sapply(iris[,1:4], as.numeric))
> y <- sapply(iris[,5], as.integer)

Calling of Julia functions with julia_call...
With JuliaCall, differently from PyJulia, we can't call the Julia functions directly; we need to employ the R function julia_call("juliaFunction",args):
> julia_eval("using BetaML")
> yencoded <- julia_call("integerEncoder",y)
> ids <- julia_call("shuffle",1:length(y))
> Xs <- X[ids,]
> ys <- yencoded[ids]
> cOut <- julia_call("kmeans",Xs,3L) # kmeans expects K to be an integer
> y_hat <- sapply(cOut[1],as.integer)[,] # We need a vector, not a matrix
> acc <- julia_call("accuracy",y_hat,ys)
> acc
[1] 0.8933333

Access using the eval() interface...
As alternative, we can embed Julia code directly in R using the julia_eval() function:
> kMeansR <- julia_eval('
+ function accFromKmeans(x,k,y_true)
+ cOut = kmeans(x,Int(k))
+ acc = accuracy(cOut[1],y_true)
+ return acc
+ end
+ ')

We can then call the above function in R in one of the following three ways:
- kMeansR(Xs,3,ys)
- julia_assign("Xs_julia", Xs); julia_assign("ys_julia", ys); julia_eval("accFromKmeans(Xs_julia,3,ys_julia)")
- julia_call("accFromKmeans",Xs,3,ys)
While other "convenience" functions are provided by the package, using julia_call, or julia_assign followed by julia_eval, should suffice to accomplish most of the tasks we may need to perform in Julia.
Some performance tips
Type stability
"Type stable" functions guarantee to the compiler that, given a certain method (i.e. with the arguments being of a given type), the object returned by the function is also of a certain fixed type. Type stability is fundamental to allow type inference to continue across the function call stack.
julia> function f1(x) # Type unstable
         outVector = [1,2.0,"2"]
         if x < 0
           return outVector[1]
         elseif x == 0
           return outVector[2]
         else
           return outVector[3]
         end
       end
f1 (generic function with 1 method)

julia> function f2(x) # Type stable
         outVector = [1,convert(Int64,2.0),parse(Int64,"2")]
         if x < 0
           return outVector[1]
         elseif x == 0
           return outVector[2]
         else
           return outVector[3]
         end
       end
f2 (generic function with 1 method)

julia> a = f1(0)
2.0

julia> b = f1(1)
"2"

julia> typeof(a)
Float64

julia> typeof(b)
String

julia> c = f2(0)
2

julia> d = f2(1)
2

julia> typeof(c)
Int64

julia> typeof(d)
Int64

julia> using BenchmarkTools

julia> @btime f1(0) # 661 ns 6 allocations
  59.785 ns (5 allocations: 192 bytes)
2.0

julia> @btime f2(0) # 55 ns 1 allocations
  38.702 ns (1 allocation: 80 bytes)
2

julia> @code_warntype f1(0) # Body::Any
MethodInstance for Main.var"Main".f1(::Int64)
  from f1(x) @ Main.var"Main" REPL[1]:1
Arguments
  #self#::Core.Const(Main.var"Main".f1)
  x::Int64
Locals
  outVector::Vector{Any}
Body::ANY
1 ─       (outVector = Base.vect(1, 2.0, "2"))
│   %2  = (x < 0)::Bool
└──       goto #3 if not %2
2 ─ %4  = Base.getindex(outVector, 1)::ANY
└──       return %4
3 ─ %6  = (x == 0)::Bool
└──       goto #5 if not %6
4 ─ %8  = Base.getindex(outVector, 2)::ANY
└──       return %8
5 ─ %10 = Base.getindex(outVector, 3)::ANY
└──       return %10

julia> @code_warntype f2(0) # Body::Int64
MethodInstance for Main.var"Main".f2(::Int64)
  from f2(x) @ Main.var"Main" REPL[2]:1
Arguments
  #self#::Core.Const(Main.var"Main".f2)
  x::Int64
Locals
  outVector::Vector{Int64}
Body::Int64
1 ─ %1  = Main.var"Main".convert(Main.var"Main".Int64, 2.0)::Core.Const(2)
│   %2  = Main.var"Main".parse(Main.var"Main".Int64, "2")::Int64
│         (outVector = Base.vect(1, %1, %2))
│   %4  = (x < 0)::Bool
└──       goto #3 if not %4
2 ─ %6  = Base.getindex(outVector, 1)::Int64
└──       return %6
3 ─ %8  = (x == 0)::Bool
└──       goto #5 if not %8
4 ─ %10 = Base.getindex(outVector, 2)::Int64
└──       return %10
5 ─ %12 = Base.getindex(outVector, 3)::Int64
└──       return %12
While in general it is NOT important to annotate function parameters for performance, it is important to annotate struct fields with concrete types
julia> abstract type Goo end

julia> struct Foo <: Goo
         x::Number
       end

julia> struct Boo <: Goo
         x::Int64
       end

julia> function f1(o::Goo)
         return o.x + 2
       end
f1 (generic function with 2 methods)

julia> fobj = Foo(1)
Main.var"Main".Foo(1)

julia> bobj = Boo(1)
Main.var"Main".Boo(1)

julia> @btime f1($fobj) # 17.1 ns 0 allocations
  17.062 ns (0 allocations: 0 bytes)
3

julia> @btime f1($bobj) # 2.8 ns 0 allocations
  2.785 ns (0 allocations: 0 bytes)
3
Here the same function is type stable for some argument types and type unstable for others:
julia> @code_warntype f1(fobj)
MethodInstance for Main.var"Main".f1(::Main.var"Main".Foo)
  from f1(o::Main.var"Main".Goo) @ Main.var"Main" REPL[4]:1
Arguments
  #self#::Core.Const(Main.var"Main".f1)
  o::Main.var"Main".Foo
Body::ANY
1 ─ %1 = Base.getproperty(o, :x)::NUMBER
│   %2 = (%1 + 2)::ANY
└──      return %2

julia> @code_warntype f1(bobj)
MethodInstance for Main.var"Main".f1(::Main.var"Main".Boo)
  from f1(o::Main.var"Main".Goo) @ Main.var"Main" REPL[4]:1
Arguments
  #self#::Core.Const(Main.var"Main".f1)
  o::Main.var"Main".Boo
Body::Int64
1 ─ %1 = Base.getproperty(o, :x)::Int64
│   %2 = (%1 + 2)::Int64
└──      return %2
Avoid (non-constant) global variables
julia> g = 2
2

julia> const cg = 1 # we can't change the _type_ of the object bound to a constant variable
1

julia> cg = 2 # we can rebind to another object of the same type, but we get a warning
# cg = 2.5 # this would error!
WARNING: redefinition of constant Main.cg. This may fail, cause incorrect answers, or produce other errors.
2

julia> f1(x,y) = x+y
f1 (generic function with 3 methods)

julia> f2(x) = x + g
f2 (generic function with 1 method)

julia> f3(x) = x + cg
f3 (generic function with 1 method)

julia> @btime f1(3,2)
  1.432 ns (0 allocations: 0 bytes)
5

julia> @btime f2(3) # 22 times slower!!!
  17.909 ns (0 allocations: 0 bytes)
5

julia> @btime f3(3) # as f1
  1.432 ns (0 allocations: 0 bytes)
5
Loop arrays with the inner loop by rows
Julia is column-major (differently from Python), so arrays of bits types are contiguous in memory across the different rows of the same column:
julia> a = rand(1000,1000);

julia> function f1(x)
         (R,C) = size(x)
         cum = 0.0
         for r in 1:R
           for c in 1:C
             cum += x[r,c]
           end
         end
         return cum
       end
f1 (generic function with 3 methods)

julia> function f2(x)
         (R,C) = size(x)
         cum = 0.0
         for c in 1:C
           for r in 1:R
             cum += x[r,c]
           end
         end
         return cum
       end
f2 (generic function with 1 method)

julia> @btime f1($a) # 2.3 ms 0 allocations
  853.972 μs (0 allocations: 0 bytes)
500220.6882968228

julia> @btime f2($a) # 1.3 ms 0 allocations
  853.201 μs (0 allocations: 0 bytes)
500220.6882968189
Use low-level optimisation when possible
julia> function f1(x)
         s = 0.0
         for i in 1:length(x)
           s += i * x[i]
         end
         return s
       end
f1 (generic function with 3 methods)

julia> function f2(x)
         s = 0.0
         for i in 1:length(x)
           @inbounds s += i * x[i] # remove bound checks
         end
         return s
       end
f2 (generic function with 1 method)

julia> function f3(x)
         s = 0.0
         @simd for i in 1:length(x) # tell the compiler it is allowed to run the loop in whatever order, exploiting the in-thread parallelism of modern CPUs
           s += i * x[i]
         end
         return s
       end
f3 (generic function with 1 method)

julia> x = rand(10000);

julia> @btime f1($x)
  8.529 μs (0 allocations: 0 bytes)
2.5231610262164485e7

julia> @btime f2($x)
  8.529 μs (0 allocations: 0 bytes)
2.5231610262164485e7

julia> @btime f3($x)
  5.121 μs (0 allocations: 0 bytes)
2.5231610262164526e7

julia> X = rand(100,20);

julia> function f1(x)
         s = 0.0
         for i in 1:size(x,1)
           s += sum(x[i,:])
         end
         return s
       end
f1 (generic function with 3 methods)

julia> function f2(x)
         s = 0.0
         @views for i in 1:size(x,1)
           s += sum(x[i,:]) # the slice operator copies the data; the @views macro forces a view (reference) instead
         end
         return s
       end
f2 (generic function with 1 method)

julia> @btime f1($X)
  4.068 μs (100 allocations: 21.88 KiB)
1013.9993057955776

julia> @btime f2($X)
  1.212 μs (0 allocations: 0 bytes)
1013.9993057955776
Note that while the @views macro "saves time" by not copying the data, the resulting array has a pretty messy memory layout. If you need to use the slice for many subsequent operations, it may be more efficient to "pay" the copy cost once and then have an array backed by a nicely contiguous block of memory.
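For example (a minimal sketch; the function name many_sums is just illustrative and the actual timings will depend on your machine), when a slice is reused many times it can pay off to materialise it once:

```julia
X = rand(1000, 1000)

function many_sums(v, n)   # run `sum` over the slice n times
    s = 0.0
    for _ in 1:n
        s += sum(v)
    end
    return s
end

row_view = @view X[1, :]   # no copy, but strided (non-contiguous) memory
row_copy = X[1, :]         # pays the copy cost once, contiguous layout

# Both give the same result; with many repetitions the one-off copy cost
# is often recouped by the faster access to contiguous memory:
many_sums(row_view, 100) ≈ many_sums(row_copy, 100) # true
```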
julia> function f1(x,y)
         if x+y > 100
           return x + y + 2
         else
           return x + 1
         end
       end
f1 (generic function with 3 methods)

julia> @inline function f2(x,y) # the function is "inlined": its whole definition is copied at each calling place rather than being called
         if x+y > 100
           return x + y + 2
         else
           return x + 1
         end
       end
f2 (generic function with 2 methods)

julia> function f3(y)
         s = 0.0
         for i in 2:y
           s += f1(i,i-1)
           s += f1(i,i)
         end
         return s
       end
f3 (generic function with 1 method)

julia> function f4(y)
         s = 0.0
         for i in 2:y
           s += f2(i,i-1)
           s += f2(i,i)
         end
         return s
       end
f4 (generic function with 1 method)

julia> x = 1000
1000

julia> @btime f3($x)
  1.708 μs (0 allocations: 0 bytes)
2.002396e6

julia> @btime f4($x)
  1.691 μs (0 allocations: 0 bytes)
2.002396e6
But be careful: inlining is not always a good idea:
julia> function f3(y)
         s = 0.0
         for i in 2:y
           s += sum(f1(a,a-1) for a in 2:i)
         end
         return s
       end
f3 (generic function with 1 method)

julia> function f4(y)
         s = 0.0
         for i in 2:y
           s += sum(f2(a,a-1) for a in 2:i)
         end
         return s
       end
f4 (generic function with 1 method)

julia> x = 1000
1000

julia> @btime f3($x)
  79.488 μs (0 allocations: 0 bytes)
3.3359915e8

julia> @btime f4($x)
  79.338 μs (0 allocations: 0 bytes)
3.3359915e8
Note that the Julia compiler already inlines small functions automatically when it thinks this will improve performance.
Profiling the code to discover bottlenecks
We have already seen @btime and @benchmark from the package BenchmarkTools.jl. Remember to quote the global variables used as parameters of your function with the dollar sign to have an accurate benchmarking of the function execution. Julia provides the macro @time, but we should look at a second call to a given function (with a certain set of parameter types), or it will include compilation time in its output:
julia> function fb(x)
         out = Union{Int64,Float64}[1,2.0,3]
         push!(out,4)
         if x > 10
           if (x > 100)
             return [out[1],out[2]] |> sum
           else
             return [out[2],out[3]] |> sum
           end
         else
           return [out[1],out[3]] |> sum
         end
       end
fb (generic function with 1 method)

julia> @time fb(3)
  0.000003 seconds (3 allocations: 256 bytes)
4

julia> @time fb(3)
  0.000002 seconds (3 allocations: 256 bytes)
4
We can use @profile function(x,y) to run sample-based profiling:
julia> using Profile # in the stdlibjulia> function foo(n) a = rand(n,n) b = a + a c = b * b return c endfoo (generic function with 2 methods)julia> @profile (for i = 1:100; foo(1000); end) # too fast otherwisejulia> Profile.print() # on my pc: 243 rand, 174 the sum, 439 the matrix productOverhead ╎ [+additional indent] Count File:Line; Function ========================================================= ╎205 @Base/client.jl:552; _start() ╎ 205 @Base/client.jl:318; exec_options(opts::Base.JLOpti... ╎ 205 @Base/Base.jl:495; include(mod::Module, _path::St... ╎ 205 @Base/loading.jl:2130; _include(mapexpr::Function, ... ╎ 205 @Base/loading.jl:2070; include_string(mapexpr::type... ╎ 205 @Base/boot.jl:385; eval ╎ ╎ 205 ...r/src/makedocs.jl:241; kwcall(::@NamedTuple{sitena... ╎ ╎ 205 ...r/src/makedocs.jl:247; #makedocs#81 ╎ ╎ 205 @Base/file.jl:112; cd(f::Documenter.var"#82#8... ╎ ╎ 205 .../src/makedocs.jl:247; #82 ╎ ╎ 205 @Base/env.jl:256; withenv(::Documenter.var"... ╎ ╎ ╎ 205 ...src/makedocs.jl:248; #83 ╎ ╎ ╎ 205 ...es/Selectors.jl:170; dispatch(::Type{Document... ╎ ╎ ╎ 205 ...er_pipeline.jl:222; runner(::Type{Documente... ╎ ╎ ╎ 205 ...er_pipeline.jl:22; expand(doc::Documenter.... ╎ ╎ ╎ 205 .../Selectors.jl:170; dispatch(::Type{Docume... ╎ ╎ ╎ ╎ 205 ...r_pipeline.jl:846; runner(::Type{Documen... ╎ ╎ ╎ ╎ 205 .../IOCapture.jl:72; kwcall(::@NamedTuple{... ╎ ╎ ╎ ╎ 205 ...IOCapture.jl:116; capture(f::Documente... ╎ ╎ ╎ ╎ 205 ...e/logging.jl:627; with_logger ╎ ╎ ╎ ╎ 205 .../logging.jl:515; with_logstate(f::Fu... ╎ ╎ ╎ ╎ ╎ 205 ...OCapture.jl:119; (::IOCapture.var"#... ╎ ╎ ╎ ╎ ╎ 205 ...pipeline.jl:847; (::Documenter.var"... ╎ ╎ ╎ ╎ ╎ 205 @Base/file.jl:112; cd(f::Documenter.... 
╎ ╎ ╎ ╎ ╎ 205 ...ipeline.jl:848; #62 ╎ ╎ ╎ ╎ ╎ 205 ...e/boot.jl:385; eval ╎ ╎ ╎ ╎ ╎ ╎ 205 REPL[3]:1; top-level scope ╎ ╎ ╎ ╎ ╎ ╎ 205 ...rofile.jl:27; macro expansion ╎ ╎ ╎ ╎ ╎ ╎ 205 REPL[3]:1; macro expansion ╎ ╎ ╎ ╎ ╎ ╎ 87 REPL[2]:2; foo(n::Int64) ╎ ╎ ╎ ╎ ╎ ╎ 87 ...Random.jl:279; rand ╎ ╎ ╎ ╎ ╎ ╎ ╎ 87 ...Random.jl:291; rand ╎ ╎ ╎ ╎ ╎ ╎ ╎ 87 ...Random.jl:290; rand ╎ ╎ ╎ ╎ ╎ ╎ ╎ 2 ...e/boot.jl:494; Array ╎ ╎ ╎ ╎ ╎ ╎ ╎ 2 ...e/boot.jl:487; Array 2╎ ╎ ╎ ╎ ╎ ╎ ╎ 2 ...e/boot.jl:479; Array ╎ ╎ ╎ ╎ ╎ ╎ ╎ 85 ...Random.jl:269; rand! ╎ ╎ ╎ ╎ ╎ ╎ ╎ 85 ...roSimd.jl:293; rand! ╎ ╎ ╎ ╎ ╎ ╎ ╎ 85 ...roSimd.jl:142; xoshiro_bulk ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 6 ...roSimd.jl:249; xoshiro_bulk_si... 6╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 6 ...roSimd.jl:71; _plus ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 1 ...roSimd.jl:252; xoshiro_bulk_si... 1╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 1 ...roSimd.jl:77; _xor ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 1 ...roSimd.jl:253; xoshiro_bulk_si... 1╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 1 ...roSimd.jl:77; _xor ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 1 ...roSimd.jl:254; xoshiro_bulk_si... 1╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 1 ...roSimd.jl:77; _xor ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 1 ...roSimd.jl:255; xoshiro_bulk_si... 1╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 1 ...roSimd.jl:77; _xor ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 3 ...roSimd.jl:256; xoshiro_bulk_si... 3╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 3 ...roSimd.jl:55; _rotl45 ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 72 ...roSimd.jl:257; xoshiro_bulk_si... ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 17 ...ointer.jl:146; unsafe_store! 17╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 17 ...ointer.jl:146; unsafe_store! 55╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 55 ...roSimd.jl:108; _bits2float ╎ ╎ ╎ ╎ ╎ ╎ 79 REPL[2]:3; foo(n::Int64) ╎ ╎ ╎ ╎ ╎ ╎ 79 ...aymath.jl:16; +(A::Matrix{Flo... ╎ ╎ ╎ ╎ ╎ ╎ ╎ 79 ...adcast.jl:892; broadcast_prese... ╎ ╎ ╎ ╎ ╎ ╎ ╎ 79 ...adcast.jl:903; materialize ╎ ╎ ╎ ╎ ╎ ╎ ╎ 79 ...adcast.jl:928; copy ╎ ╎ ╎ ╎ ╎ ╎ ╎ 56 ...adcast.jl:956; copyto! ╎ ╎ ╎ ╎ ╎ ╎ ╎ 56 ...adcast.jl:1003; copyto! 
1╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 1 ...mdloop.jl:0; macro expansion 1╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 1 ...mdloop.jl:75; macro expansion ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 54 ...mdloop.jl:77; macro expansion ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 54 ...adcast.jl:1004; macro expansion ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 22 ...adcast.jl:636; getindex ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 18 ...adcast.jl:681; _broadcast_geti... ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 18 ...adcast.jl:705; _getindex ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ +1 18 ...adcast.jl:675; _broadcast_geti... ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ +2 18 ...sional.jl:696; getindex 18╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ +3 18 ...ntials.jl:14; getindex ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 4 ...adcast.jl:682; _broadcast_geti... ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 4 ...adcast.jl:709; _broadcast_geti... 4╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ +1 4 .../float.jl:409; + ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 32 ...sional.jl:698; setindex! 32╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 32 .../array.jl:1024; setindex! ╎ ╎ ╎ ╎ ╎ ╎ ╎ 23 ...adcast.jl:223; similar ╎ ╎ ╎ ╎ ╎ ╎ ╎ 23 ...adcast.jl:224; similar ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 23 ...tarray.jl:873; similar ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 23 ...tarray.jl:874; similar ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 23 ...e/boot.jl:494; Array ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 23 ...e/boot.jl:487; Array 23╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 23 ...e/boot.jl:479; Array ╎ ╎ ╎ ╎ ╎ ╎ 39 REPL[2]:4; foo(n::Int64) ╎ ╎ ╎ ╎ ╎ ╎ 39 ...matmul.jl:113; *(A::Matrix{Flo... ╎ ╎ ╎ ╎ ╎ ╎ ╎ 2 .../array.jl:420; similar ╎ ╎ ╎ ╎ ╎ ╎ ╎ 2 ...e/boot.jl:487; Array 2╎ ╎ ╎ ╎ ╎ ╎ ╎ 2 ...e/boot.jl:479; Array ╎ ╎ ╎ ╎ ╎ ╎ ╎ 37 ...matmul.jl:237; mul! ╎ ╎ ╎ ╎ ╎ ╎ ╎ 37 ...matmul.jl:263; mul! ╎ ╎ ╎ ╎ ╎ ╎ ╎ 37 ...matmul.jl:352; generic_matmatmul! ╎ ╎ ╎ ╎ ╎ ╎ ╎ 37 ...matmul.jl:605; gemm_wrapper!(C... 37╎ ╎ ╎ ╎ ╎ ╎ ╎ 37 ...c/blas.jl:1524; gemm!(transA::C... ╎5928 @Base/task.jl:675; task_done_hook(t::Task) ╎ 5928 @Base/task.jl:994; wait() 5927╎ 5928 @Base/task.jl:985; poptask(W::Base.IntrusiveLinke... Total snapshots: 7904. Utilization: 25% across all threads and tasks. Use the `groupby` kwarg to break down by thread and/or task.julia> Profile.clear()
Introspection and debugging
To diagnose problems in code more generally, we can use the several introspection functions that Julia provides (some of which we have already seen):
julia> # @less rand(3) # Show the source code of the specific method invoked - use `q` to quit # @edit rand(3) # Like @less but it opens the source code in an editor methods(foo)# 2 methods for generic function "foo" from Main.var"Main": [1] foo() @ REPL[1]:1 [2] foo(n) @ REPL[2]:1julia> @which foo(2) # which method am I using when I call foo with an integer?foo(n) @ Main.var"Main" REPL[2]:1julia> typeof(a)Matrix{Float64} (alias for Array{Float64, 2})julia> eltype(a)Float64julia> isa(1.2, Number)truejulia> fieldnames(Foo)(:x,)julia> dump(fobj)Main.var"Main".Foo x: Int64 1julia> names(Main, all=false) # available (e.g. exported) identifiers of a given module13-element Vector{Symbol}: :Base :Core :LESSONS_ROOTDIR :LESSONS_ROOTDIR_TMP :LESSONS_SUBDIR :MAKE_PDF :Main :include_sandbox :link_example :literate_directory :makeList :preprocess :rdirjulia> sizeof(2) # bytes8julia> typemin(Int64)-9223372036854775808julia> typemax(Int64)9223372036854775807julia> bitstring(2)"0000000000000000000000000000000000000000000000000000000000000010"
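The calls above rely on objects defined earlier in the lesson (`foo`, `a`, `Foo`, `fobj`). As a self-contained sketch, the same introspection functions can be tried on a made-up struct (`Point` below is purely illustrative):

```julia
# Hypothetical struct, just to have something to introspect
struct Point
    x::Float64
    y::Float64
end

p = Point(1.0, 2.0)

fieldnames(Point)        # (:x, :y)
typeof(p)                # Point
isa(p.x, Number)         # true
eltype([p.x, p.y])       # Float64
sizeof(Int64)            # 8 (bytes)
supertype(Float64)       # AbstractFloat
```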
Various low-level interpretation of an expression
julia> @code_native foo(3).text .file "foo" .globl julia_foo_37825 # -- Begin function julia_foo_37825 .p2align 4, 0x90 .type julia_foo_37825,@function julia_foo_37825: # @julia_foo_37825 ; ┌ @ REPL[2]:1 within `foo` # %bb.0: # %top push rbp mov rbp, rsp push r15 push r14 push r12 push rbx sub rsp, 32 vxorps xmm0, xmm0, xmm0 vmovaps xmmword ptr [rbp - 64], xmm0 mov qword ptr [rbp - 48], 0 #APP mov rax, qword ptr fs:[0] #NO_APP mov rdx, rdi lea rcx, [rbp - 64] movabs rdi, 140147859060880 ; │ @ REPL[2]:2 within `foo` ; │┌ @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/Random.jl:279 within `rand` @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/Random.jl:291 @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/Random.jl:290 ; ││┌ @ boot.jl:494 within `Array` @ boot.jl:487 @ boot.jl:479 mov rsi, rdx mov r12, qword ptr [rax - 8] mov qword ptr [rbp - 64], 4 mov rax, qword ptr [r12] mov qword ptr [rbp - 56], rax movabs rax, 140148023411424 mov qword ptr [r12], rcx call rax ; ││└ ; ││┌ @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/Random.jl:269 within `rand!` @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/XoshiroSimd.jl:293 ; │││┌ @ essentials.jl:10 within `length` mov rbx, qword ptr [rax + 8] ; │││└ ; │││┌ @ abstractarray.jl:1234 within `pointer` ; ││││┌ @ pointer.jl:65 within `unsafe_convert` mov r14, qword ptr [rax] ; ││└└└ ; ││┌ @ boot.jl:494 within `Array` @ boot.jl:487 @ boot.jl:479 mov r15, rax ; ││└ ; ││┌ @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/Random.jl:269 within `rand!` @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/XoshiroSimd.jl:293 ; │││┌ @ int.jl:88 within `*` 
shl rbx, 3 ; │││└ ; │││┌ @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/XoshiroSimd.jl:141 within `xoshiro_bulk` ; ││││┌ @ operators.jl:425 within `>=` ; │││││┌ @ int.jl:514 within `<=` cmp rbx, 64 ; ││││└└ jl .LBB0_2 # %bb.1: # %L11 ; ││││ @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/XoshiroSimd.jl:142 within `xoshiro_bulk` movabs rax, offset j_xoshiro_bulk_simd_37827 mov qword ptr [rbp - 48], r15 mov rdi, r14 mov rsi, rbx call rax ; ││││ @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/XoshiroSimd.jl:143 within `xoshiro_bulk` ; ││││┌ @ int.jl:86 within `-` sub rbx, rax ; ││││└ ; ││││ @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/XoshiroSimd.jl:144 within `xoshiro_bulk` ; ││││┌ @ pointer.jl:282 within `+` add r14, rax .LBB0_2: # %L17 ; ││││└ ; ││││ @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/XoshiroSimd.jl:146 within `xoshiro_bulk` ; ││││┌ @ operators.jl:276 within `!=` ; │││││┌ @ promotion.jl:521 within `==` test rbx, rbx ; ││││└└ je .LBB0_4 # %bb.3: # %L22 ; ││││ @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/XoshiroSimd.jl:147 within `xoshiro_bulk` movabs rax, offset j_xoshiro_bulk_nosimd_37828 mov qword ptr [rbp - 48], r15 mov rdi, r14 mov rsi, rbx call rax .LBB0_4: # %L24 ; │└└└ ; │ @ REPL[2]:3 within `foo` movabs rax, offset "j_+_37829" mov qword ptr [rbp - 48], r15 mov rdi, r15 mov rsi, r15 call rax ; │ @ REPL[2]:4 within `foo` movabs rcx, offset "j_*_37830" mov qword ptr [rbp - 48], rax mov rdi, rax mov rsi, rax call rcx mov rcx, qword ptr [rbp - 56] mov qword ptr [r12], rcx ; │ @ REPL[2]:5 within `foo` add rsp, 32 pop rbx pop r12 pop r14 pop r15 pop rbp ret .Lfunc_end0: .size julia_foo_37825, 
.Lfunc_end0-julia_foo_37825 ; └ # -- End function .section ".note.GNU-stack","",@progbitsjulia> @code_llvm foo(3); @ REPL[2]:1 within `foo` define nonnull {}* @julia_foo_37873(i64 signext %0) #0 { top: %gcframe3 = alloca [3 x {}*], align 16 %gcframe3.sub = getelementptr inbounds [3 x {}*], [3 x {}*]* %gcframe3, i64 0, i64 0 %1 = bitcast [3 x {}*]* %gcframe3 to i8* call void @llvm.memset.p0i8.i64(i8* align 16 %1, i8 0, i64 24, i1 true) %thread_ptr = call i8* asm "movq %fs:0, $0", "=r"() #7 %tls_ppgcstack = getelementptr i8, i8* %thread_ptr, i64 -8 %2 = bitcast i8* %tls_ppgcstack to {}**** %tls_pgcstack = load {}***, {}**** %2, align 8 ; @ REPL[2]:2 within `foo` ; ┌ @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/Random.jl:279 within `rand` @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/Random.jl:291 @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/Random.jl:290 ; │┌ @ boot.jl:494 within `Array` @ boot.jl:487 @ boot.jl:479 %3 = bitcast [3 x {}*]* %gcframe3 to i64* store i64 4, i64* %3, align 16 %4 = getelementptr inbounds [3 x {}*], [3 x {}*]* %gcframe3, i64 0, i64 1 %5 = bitcast {}** %4 to {}*** %6 = load {}**, {}*** %tls_pgcstack, align 8 store {}** %6, {}*** %5, align 8 %7 = bitcast {}*** %tls_pgcstack to {}*** store {}** %gcframe3.sub, {}*** %7, align 8 %8 = call nonnull {}* inttoptr (i64 140148023411424 to {}* ({}*, i64, i64)*)({}* inttoptr (i64 140147859060880 to {}*), i64 %0, i64 %0) ; │└ ; │┌ @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/Random.jl:269 within `rand!` @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/XoshiroSimd.jl:293 ; ││┌ @ abstractarray.jl:1234 within `pointer` ; │││┌ @ pointer.jl:65 within `unsafe_convert` %9 = bitcast {}* %8 to i8** %arrayptr = load i8*, i8** 
%9, align 8 %10 = ptrtoint i8* %arrayptr to i64 ; ││└└ ; ││┌ @ essentials.jl:10 within `length` %11 = bitcast {}* %8 to { i8*, i64, i16, i16, i32 }* %arraylen_ptr = getelementptr inbounds { i8*, i64, i16, i16, i32 }, { i8*, i64, i16, i16, i32 }* %11, i64 0, i32 1 %arraylen = load i64, i64* %arraylen_ptr, align 8 ; ││└ ; ││┌ @ int.jl:88 within `*` %12 = shl i64 %arraylen, 3 ; ││└ ; ││┌ @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/XoshiroSimd.jl:141 within `xoshiro_bulk` ; │││┌ @ operators.jl:425 within `>=` ; ││││┌ @ int.jl:514 within `<=` %13 = icmp slt i64 %12, 64 ; │││└└ br i1 %13, label %L17, label %L11 L11: ; preds = %top %14 = getelementptr inbounds [3 x {}*], [3 x {}*]* %gcframe3, i64 0, i64 2 store {}* %8, {}** %14, align 16 ; │││ @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/XoshiroSimd.jl:142 within `xoshiro_bulk` %15 = call i64 @j_xoshiro_bulk_simd_37875(i64 zeroext %10, i64 signext %12) ; │││ @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/XoshiroSimd.jl:143 within `xoshiro_bulk` ; │││┌ @ int.jl:86 within `-` %16 = sub i64 %12, %15 ; │││└ ; │││ @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/XoshiroSimd.jl:144 within `xoshiro_bulk` ; │││┌ @ pointer.jl:282 within `+` %17 = getelementptr i8, i8* %arrayptr, i64 %15 %18 = ptrtoint i8* %17 to i64 ; ││││┌ @ essentials.jl:517 within `oftype` ; │││││┌ @ pointer.jl:26 within `convert` ; ││││││┌ @ boot.jl:800 within `Ptr` br label %L17 L17: ; preds = %L11, %top %value_phi = phi i64 [ %18, %L11 ], [ %10, %top ] %value_phi1 = phi i64 [ %16, %L11 ], [ %12, %top ] ; │││└└└└ ; │││ @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/XoshiroSimd.jl:146 within `xoshiro_bulk` ; │││┌ @ operators.jl:276 within `!=` ; ││││┌ @ promotion.jl:521 
within `==` %.not = icmp eq i64 %value_phi1, 0 ; │││└└ br i1 %.not, label %L24, label %L22 L22: ; preds = %L17 %19 = getelementptr inbounds [3 x {}*], [3 x {}*]* %gcframe3, i64 0, i64 2 store {}* %8, {}** %19, align 16 ; │││ @ /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Random/src/XoshiroSimd.jl:147 within `xoshiro_bulk` call void @j_xoshiro_bulk_nosimd_37876(i64 zeroext %value_phi, i64 signext %value_phi1) br label %L24 L24: ; preds = %L22, %L17 %20 = getelementptr inbounds [3 x {}*], [3 x {}*]* %gcframe3, i64 0, i64 2 store {}* %8, {}** %20, align 16 ; └└└ ; @ REPL[2]:3 within `foo` %21 = call nonnull {}* @"j_+_37877"({}* nonnull %8, {}* nonnull %8) store {}* %21, {}** %20, align 16 ; @ REPL[2]:4 within `foo` %22 = call nonnull {}* @"j_*_37878"({}* nonnull %21, {}* nonnull %21) %23 = load {}*, {}** %4, align 8 %24 = bitcast {}*** %tls_pgcstack to {}** store {}* %23, {}** %24, align 8 ; @ REPL[2]:5 within `foo` ret {}* %22 }julia> @code_typed foo(3)CodeInfo( 1 ── %1 = $(Expr(:foreigncall, :(:jl_alloc_array_2d), Matrix{Float64}, svec(Any, Int64, Int64), 0, :(:ccall), Matrix{Float64}, Core.Argument(2), Core.Argument(2), Core.Argument(2), Core.Argument(2)))::Matrix{Float64} │ %2 = $(Expr(:gc_preserve_begin, :(%1))) │ %3 = $(Expr(:foreigncall, :(:jl_array_ptr), Ptr{Float64}, svec(Any), 0, :(:ccall), :(%1)))::Ptr{Float64} │ %4 = Base.bitcast(Ptr{UInt8}, %3)::Ptr{UInt8} │ %5 = Base.arraylen(%1)::Int64 │ %6 = Base.mul_int(%5, 8)::Int64 │ %7 = Random.XoshiroSimd.Float64::Type{Float64} │ %8 = Random.XoshiroSimd._bits2float::typeof(Random.XoshiroSimd._bits2float) │ %9 = Base.sle_int(64, %6)::Bool └─── goto #3 if not %9 2 ── %11 = invoke Random.XoshiroSimd.xoshiro_bulk_simd($(QuoteNode(Random.TaskLocalRNG()))::Random.TaskLocalRNG, %4::Ptr{UInt8}, %6::Int64, %7::Type{Float64}, $(QuoteNode(Val{8}()))::Val{8}, %8::typeof(Random.XoshiroSimd._bits2float))::Int64 │ %12 = Base.sub_int(%6, %11)::Int64 │ %13 = Core.bitcast(Core.UInt, 
%4)::UInt64 │ %14 = Base.bitcast(UInt64, %11)::UInt64 │ %15 = Base.add_ptr(%13, %14)::UInt64 └─── %16 = Core.bitcast(Ptr{UInt8}, %15)::Ptr{UInt8} 3 ┄─ %17 = φ (#2 => %16, #1 => %4)::Ptr{UInt8} │ %18 = φ (#2 => %12, #1 => %6)::Int64 │ %19 = (%18 === 0)::Bool │ %20 = Base.not_int(%19)::Bool └─── goto #5 if not %20 4 ── invoke Random.XoshiroSimd.xoshiro_bulk_nosimd($(QuoteNode(Random.TaskLocalRNG()))::Random.TaskLocalRNG, %17::Ptr{UInt8}, %18::Int64, %7::Type{Float64}, %8::typeof(Random.XoshiroSimd._bits2float))::Any 5 ┄─ goto #6 6 ── $(Expr(:gc_preserve_end, :(%2))) └─── goto #7 7 ── goto #8 8 ── goto #9 9 ── goto #10 10 ─ goto #11 11 ─ %30 = invoke Main.var"Main".:+(%1::Matrix{Float64}, %1::Matrix{Float64})::Matrix{Float64} │ %31 = invoke Main.var"Main".:*(%30::Matrix{Float64}, %30::Matrix{Float64})::Matrix{Float64} └─── return %31 ) => Matrix{Float64}julia> @code_lowered foo(3)CodeInfo( 1 ─ a = Main.var"Main".rand(n, n) │ b = a + a │ c = b * b └── return c )
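Each of these macros also has a functional counterpart (`code_lowered`, `code_typed`, `code_llvm`, `code_native`) that can be called programmatically, passing the function and a tuple of argument types. A minimal sketch (`bar` is a made-up function, not from the lesson):

```julia
bar(n) = sum(rand(n))

# code_lowered returns a vector of CodeInfo objects, one per matching method
ci = code_lowered(bar, (Int,))

# code_typed additionally reports the return type inferred for these argument types
ti = code_typed(bar, (Int,))

length(ci)     # 1: a single method of `bar` matches an Int argument
last(ti[1])    # Float64, the inferred return type
```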
We can also use a debugger, e.g. the one integrated in Juno or VSCode. Graphical debuggers allow us to put a breakpoint on a specific line of code, run the code in debug mode (yes, it will be slower), let the program reach the breakpoint and inspect the state of the system at that point of the code, including local variables. In Julia we can even change the program interactively! Other typical operations are running a single line, stepping into a function, running until the current function returns, etc.
Runtime exceptions
As in many (all?) languages, when Julia encounters an error it raises an exception. If the exception is not caught at a higher level in the call stack (i.e. recognised and handled), it leads to an error and a return to the prompt, or to the termination of the script (and, rarely, to the Julia process crashing altogether).
The idea is that we try some potentially dangerous code and, if an error is raised in it, we catch and handle it.
julia> function customIndex(vect,idx;toReturn=0) try vect[idx] catch e if isa(e,BoundsError) return toReturn end rethrow(e) end endcustomIndex (generic function with 1 method)julia> a = [1,2,3] # a[4] # Error ("BoundsError" to be precise)3-element Vector{Int64}: 1 2 3julia> customIndex(a,4)0
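We are not limited to catching built-in exceptions such as `BoundsError`: we can define and throw our own exception types by subtyping `Exception`. A minimal sketch (the `NegativeInputError` type and `mysqrt` function are made-up examples):

```julia
# Hypothetical custom exception type
struct NegativeInputError <: Exception
    value::Float64
end

mysqrt(x) = x < 0 ? throw(NegativeInputError(x)) : sqrt(x)

result = try
    mysqrt(-4.0)
catch e
    isa(e, NegativeInputError) ? 0.0 : rethrow(e)  # handle only our own error type
end
# result is now 0.0
```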
Note that handling exceptions is computationally expensive, so do not use exceptions in place of conditional statements
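For example, `customIndex` above can avoid the try/catch entirely with an explicit bounds check, which is typically much cheaper than raising and catching a `BoundsError` (a sketch with a hypothetical function name):

```julia
function customIndexChecked(vect, idx; toReturn=0)
    # checkbounds(Bool, ...) returns true/false instead of throwing
    return checkbounds(Bool, vect, idx) ? vect[idx] : toReturn
end

a = [1, 2, 3]
customIndexChecked(a, 2)   # 2
customIndexChecked(a, 4)   # 0, and no exception was raised
```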
Parallel computation
Finally, a note on parallel computation. We only cover some basic usage of multithreading and multiprocessing in this course, but with Julia it is relatively easy to parallelise code using either multiple threads or multiple processes. What's the difference?
- multithreading
  - advantages: computationally "cheap" to create (the memory is shared)
  - disadvantages: limited to the number of cores within a CPU; requires attention not to overwrite the same memory, or to do so only in the intended order ("data race"); we can't add threads dynamically (within a script)
- multiprocessing
  - advantages: unlimited number; can be run on different CPUs of the same machine or on different nodes of a cluster, even using SSH on different networks; we can add processes from within our code with addprocs(nToAdd)
  - disadvantages: the memory is copied (each process has its own memory), which is computationally expensive (you need a gain higher than the cost of setting up a new process) and requires attention in selecting which memory a given process needs to "bring with it" to do its job
Note that if you are reading this document on its GitHub Pages version, this script was compiled using GitHub Actions, where a single thread and a single process are available, so you will not see performance gains.
Multithreading
It is not possible to add threads dynamically: we either have to start Julia with the command-line parameter -t (e.g. -t 8 or -t auto), or use the VSCode Julia extension setting Julia: Num Threads
julia> function inner(x) s = 0.0 for i in 1:x for j in 1:i if j%2 == 0 s += j else s -= j end end end return s endinner (generic function with 1 method)julia> function parentSingleThread(x,y) toTest = x .+ (1:y) out = zeros(length(toTest)) for i in 1:length(toTest) out[i] = inner(toTest[i]) end return out endparentSingleThread (generic function with 1 method)julia> function parentThreaded(x,y) toTest = x .+ (1:y) out = zeros(length(toTest)) Threads.@threads for i in 1:length(toTest) out[i] = inner(toTest[i]) end return out endparentThreaded (generic function with 1 method)julia> x = 100100julia> y = 2020julia> str = parentSingleThread(x,y)20-element Vector{Float64}: -51.0 0.0 -52.0 0.0 -53.0 0.0 -54.0 0.0 -55.0 0.0 -56.0 0.0 -57.0 0.0 -58.0 0.0 -59.0 0.0 -60.0 0.0julia> mtr = parentThreaded(x,y)20-element Vector{Float64}: -51.0 0.0 -52.0 0.0 -53.0 0.0 -54.0 0.0 -55.0 0.0 -56.0 0.0 -57.0 0.0 -58.0 0.0 -59.0 0.0 -60.0 0.0julia> str == mtr # truetruejulia> Threads.nthreads() # 4 in my case4julia> Threads.threadid()1julia> @btime parentSingleThread(100,20) # 140 μs on my machine105.357 μs (1 allocation: 224 bytes) 20-element Vector{Float64}: -51.0 0.0 -52.0 0.0 -53.0 0.0 -54.0 0.0 -55.0 0.0 -56.0 0.0 -57.0 0.0 -58.0 0.0 -59.0 0.0 -60.0 0.0julia> @btime parentThreaded(100,20) # 47 μs38.491 μs (22 allocations: 2.42 KiB) 20-element Vector{Float64}: -51.0 0.0 -52.0 0.0 -53.0 0.0 -54.0 0.0 -55.0 0.0 -56.0 0.0 -57.0 0.0 -58.0 0.0 -59.0 0.0 -60.0 0.0
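In the example above each thread writes to its own slot `out[i]`, so there is no conflict. When several threads must instead update the same variable, we can avoid the "data race" mentioned earlier by using an atomic value (or a lock). A minimal sketch, correct for any number of threads (the function name is made up):

```julia
function countEvens(n)
    cnt = Threads.Atomic{Int}(0)          # a thread-safe integer counter
    Threads.@threads for i in 1:n
        if i % 2 == 0
            Threads.atomic_add!(cnt, 1)   # atomic increment: no data race
        end
    end
    return cnt[]                          # read the final value
end

countEvens(1000)   # 500, whatever the number of threads
```

Had we written `cnt += 1` on a plain shared variable, threads could read and write it concurrently and the result would be wrong (and non-deterministic).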
Multiprocessing
NOTE: The code on multiprocessing is commented out (not executed, so you don't see the output), as GitHub Actions (used to run this code and render the web pages you are reading) has problems running multi-process functions:
using Distributed # from the Standard Library
addprocs(3) # 2,3,4
The first process is considered a sort of "master" process; the others are the "workers". We can add processes on other machines by providing the SSH connection details directly in the addprocs() call (Julia must be installed on those machines as well). We can alternatively start Julia directly with n worker processes using the argument -p n on the command line.
println("Worker pids: ")
for pid in workers() # return a vector to the pids
println(pid) # 2,3,4
end
rmprocs(workers()[1]) # remove process pid 2
println("Worker pids: ")
for pid in workers()
println(pid) # 3,4 are left
end
@everywhere begin using Distributed end # this is needed only in GitHub action
@everywhere println(myid()) # 4,3
Run heavy tasks in parallel
using Distributed, BenchmarkTools
a = rand(1:35,100)
@everywhere function fib(n)
if n == 0 return 0 end
if n == 1 return 1 end
return fib(n-1) + fib(n-2)
end
The macro @everywhere makes the given function (or several functions, with @everywhere begin [shared function definitions] end or @everywhere include("sharedCode.jl")) available to all the current workers.
result = map(fib,a)
The pmap function ("parallel" map) automatically picks up the free processes, assigns them jobs from the "input" array and merges the results into the returned array. Note that the order is preserved:
result2 = pmap(fib,a)
result == result2
@btime map(fib,$a) # serialised: median time: 514 ms 1 allocations
@btime pmap(fib,$a) # parallelised: median time: 265 ms 4220 allocations # the memory of `a` needs to be copied to all processes
Divide and Conquer
Rather than having a "heavy" operation and being interested in the individual results, here we have a "light" operation and we want to aggregate the results of the various computations using some aggregation function. We can then use the @distributed (aggregationFunction) for [forConditions] macro:
using Distributed, BenchmarkTools
function f(n) # our single-process benchmark
s = 0.0
for i = 1:n
s += i/2
end
return s
end
function pf(n)
s = @distributed (+) for i = 1:n # aggregate using sum on variable s
i/2 # the value of the last expression in the for body is what the aggregator combines
end
return s
end
@btime f(10000000) # median time: 11.1 ms 0 allocations
@btime pf(10000000) # median time: 5.7 ms 145 allocations
Note that, also in this case, the improvement is less than proportional to the number of processes we add.
Details on parallel computation can be found in the official documentation, including information on running Julia natively on GPUs or TPUs.
This page was generated using Literate.jl.