AGAL bytecode format

AGAL bytecode must use Endian.LITTLE_ENDIAN format.

Bytecode Header

AGAL bytecode must begin with a 7-byte header:

A0 01000000 A1 00 -- for a vertex program 
A0 01000000 A1 01 -- for a fragment program

Offset (bytes)

Size (bytes)

Name

Description

0

1

magic

must be 0xa0

1

4

version

must be 1

5

1

shader type ID

must be 0xa1

6

1

shader type

0 for a vertex program; 1 for a fragment program

Tokens

The header is immediately followed by any number of tokens. Every token is 192 bits (24 bytes) in size and always has the format:

[opcode][destination][source1][source2 or sampler]

Not every opcode uses all of these fields. Unused fields must be set to 0.

Operation codes

The [opcode] field is 32 bits in size and can take one of these values:

Name

Opcode

Operation

Description

mov

0x00

move

move data from source1 to destination, component-wise

add

0x01

add

destination = source1 + source2, component-wise

sub

0x02

subtract

destination = source1 - source2, component-wise

mul

0x03

multiply

destination = source1 * source2, component-wise

div

0x04

divide

destination = source1 / source2, component-wise

rcp

0x05

reciprocal

destination = 1/source1, component-wise

min

0x06

minimum

destination = minimum(source1,source2), component-wise

max

0x07

maximum

destination = maximum(source1,source2), component-wise

frc

0x08

fractional

destination = source1 - (float)floor(source1), component-wise

sqt

0x09

square root

destination = sqrt(source1), component-wise

rsq

0x0a

reciprocal root

destination = 1/sqrt(source1), component-wise

pow

0x0b

power

destination = pow(source1,source2), component-wise

log

0x0c

logarithm

destination = log_2(source1), component-wise

exp

0x0d

exponential

destination = 2^source1, component-wise

nrm

0x0e

normalize

destination = normalize(source1), component-wise (produces only a 3 component result, destination must be masked to .xyz or less)

sin

0x0f

sine

destination = sin(source1), component-wise

cos

0x10

cosine

destination = cos(source1), component-wise

crs

0x11

cross product

destination.x = source1.y * source2.z - source1.z * source2.y

destination.y = source1.z * source2.x - source1.x * source2.z

destination.z = source1.x * source2.y - source1.y * source2.x

(produces only a 3 component result, destination must be masked to .xyz or less)

dp3

0x12

dot product

destination = source1.x*source2.x + source1.y*source2.y + source1.z*source2.z

dp4

0x13

dot product

destination = source1.x*source2.x + source1.y*source2.y + source1.z*source2.z + source1.w*source2.w

abs

0x14

absolute

destination = abs(source1), component-wise

neg

0x15

negate

destination = -source1, component-wise

sat

0x16

saturate

destination = maximum(minimum(source1,1),0), component-wise

m33

0x17

multiply matrix 3x3

destination.x = (source1.x * source2[0].x) + (source1.y * source2[0].y) + (source1.z * source2[0].z)

destination.y = (source1.x * source2[1].x) + (source1.y * source2[1].y) + (source1.z * source2[1].z)

destination.z = (source1.x * source2[2].x) + (source1.y * source2[2].y) + (source1.z * source2[2].z)

(produces only a 3 component result, destination must be masked to .xyz or less)

m44

0x18

multiply matrix 4x4

destination.x = (source1.x * source2[0].x) + (source1.y * source2[0].y) + (source1.z * source2[0].z) + (source1.w * source2[0].w)

destination.y = (source1.x * source2[1].x) + (source1.y * source2[1].y) + (source1.z * source2[1].z) + (source1.w * source2[1].w)

destination.z = (source1.x * source2[2].x) + (source1.y * source2[2].y) + (source1.z * source2[2].z) + (source1.w * source2[2].w)

destination.w = (source1.x * source2[3].x) + (source1.y * source2[3].y) + (source1.z * source2[3].z) + (source1.w * source2[3].w)

m34

0x19

multiply matrix 3x4

destination.x = (source1.x * source2[0].x) + (source1.y * source2[0].y) + (source1.z * source2[0].z) + (source1.w * source2[0].w)

destination.y = (source1.x * source2[1].x) + (source1.y * source2[1].y) + (source1.z * source2[1].z) + (source1.w * source2[1].w)

destination.z = (source1.x * source2[2].x) + (source1.y * source2[2].y) + (source1.z * source2[2].z) + (source1.w * source2[2].w)

(produces only a 3 component result, destination must be masked to .xyz or less)

kil

0x27

kill/discard (fragment shader only)

If single scalar source component is less than zero, fragment is discarded and not drawn to the frame buffer. (Destination register must be set to all 0)

tex

0x28

texture sample (fragment shader only)

destination equals load from texture source2 at coordinates source1. In this case, source2 must be in sampler format.

sge

0x29

set-if-greater-equal

destination = source1 >= source2 ? 1 : 0, component-wise

slt

0x2a

set-if-less-than

destination = source1 < source2 ? 1 : 0, component-wise

seq

0x2c

set-if-equal

destination = source1 == source2 ? 1 : 0, component-wise

sne

0x2d

set-if-not-equal

destination = source1 != source2 ? 1 : 0, component-wise

Destination field format

The [destination] field is 32 bits in size:

31.............................0 
----TTTT----MMMMNNNNNNNNNNNNNNNN

T = Register type (4 bits)

M = Write mask (4 bits)

N = Register number (16 bits)

- = undefined, must be 0

Source field format

The [source] field is 64 bits in size:

63.............................................................0 
D-------------QQ----IIII----TTTTSSSSSSSSOOOOOOOONNNNNNNNNNNNNNNN

D = Direct=0/Indirect=1 for direct Q and I are ignored, 1bit

Q = Index register component select (2 bits)

I = Index register type (4 bits)

T = Register type (4 bits)

S = Swizzle (8 bits, 2 bits per component)

O = Indirect offset (8 bits)

N = Register number (16 bits)

- = undefined, must be 0

Sampler field format

The second source field for the tex opcode must be in [sampler] format, which is 64 bits in size:

63.............................................................0 
FFFFMMMMWWWWSSSSDDDD--------TTTT--------BBBBBBBBNNNNNNNNNNNNNNNN

N = Sampler register number (16 bits)

B = Texture level-of-detail (LOD) bias, signed integer, scale by 8. The floating point value used is b/8.0 (8 bits)

T = Register type, must be 5, Sampler (4 bits)

F = Filter (0=nearest,1=linear) (4 bits)

M = Mipmap (0=disable,1=nearest, 2=linear)

W = Wrapping (0=clamp,1=repeat)

S = Special flag bits (must be 0)

D = Dimension (0=2D, 1=Cube)

Program Registers

The following register types are defined. Use the value listed to specify a register type in the fields of a token:

Name

Value

Number per fragment program

Number per vertex program

Usage

Attribute

0

n/a

8

Vertex shader input; read from a vertex buffer specified using Context3D.setVertexBufferAt().

Constant

1

28

128

Shader input; set using the Context3D.setProgramConstants() family of functions.

Temporary

2

8

8

Temporary register for computation, not accessible outside program.

Output

3

1

1

Shader output: in a vertex program, the output is the clip space position; in a fragment program, the output is a color.

Varying

4

8

8

Transfer interpolated data between vertex and fragment shaders. The varying registers from the vertex program are applied as input to the fragment program. Values are interpolated according to the distance from the triangle vertices.

Sampler

5

8

n/a

Fragment shader input; read from a texture specified using Context3D.setTextureAt().