SAXS Data Collection and Analysis Protocol at APS 18-ID-D

 

PROTEIN PREPARATION METHODS

     Proteins purified over Ni affinity, heparin, ion exchange, SEC columns, etc.

     Protein purity ensured by a mandatory SEC step at the end and checked by DLS, SDS-PAGE, etc.

     Protein concentration for SAXS "rule of thumb": concentration (mg/ml) ≥ 100/MW (kDa). E.g., for a 50 kDa protein, the optimal concentration is ≥ 100/50 = 2 mg/ml (*for SEC-SAXS, take the dilution ratio (~3X) of SEC columns into consideration). Typically prepare at least 500 µl at > 5 mg/ml.

     Post SDS-PAGE gels and chromatograms in your sample data-sheet for sample quality check and future reference.
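
     The rule of thumb above can be sketched as a small calculation. This is a minimal sketch, not a beamline tool: the function name is hypothetical, and multiplying by the ~3X SEC dilution is an assumed reading of "take dilution ratio into consideration".

```python
def min_saxs_concentration(mw_kda, sec_dilution=1.0):
    """Minimum loading concentration in mg/ml from the 100/MW rule of thumb.

    mw_kda       -- molecular weight in kDa
    sec_dilution -- expected dilution on the SEC column (1.0 = no column);
                    scaling by this factor is an assumption, not a beamline rule
    """
    if mw_kda <= 0:
        raise ValueError("molecular weight must be positive")
    return 100.0 / mw_kda * sec_dilution


# 50 kDa protein, as in the example above
print(min_saxs_concentration(50))                    # 2.0 mg/ml without a column
print(min_saxs_concentration(50, sec_dilution=3.0))  # 6.0 mg/ml loaded for SEC-SAXS
```

     For the 50 kDa example, this suggests loading roughly 6 mg/ml for SEC-SAXS, which is in line with the "> 5 mg/ml" guidance.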

 

SAXS COLLECTION METHOD

     SEC-MALS-SAXS / SEC-SAXS

     Buffer: 50 mM HEPES or Tris or phosphate buffer, pH ranging from 6-8 (if outside this range, consult beamline personnel for potential special measures), 150 mM NaCl (typical but up to 1M is permitted, can go higher but contrast will be affected), 1-2 % glycerol (higher concentrations can be done but column back pressures will increase), up to 5 mM DTT (or 1-2 mM TCEP)

     Inline Wyatt Technology WTC-030S5 (SN 0787), 0.8 ml/min elution rate (18 ml column)

     Columns for SEC-MALS-SAXS: Wyatt SEC columns for MALS (ask Srinivas if you need a specific column)

1.    010S5 100Å (MW range 100-100,000 Da)

2.    015S5 150Å (MW range 500-150,000 Da)

3.    030S5 300Å (MW range 5,000-1,250,000 Da)

     Columns for SEC-SAXS: GE Healthcare SEC columns, run at 0.75 ml/min elution rate (24 ml column volume). Discuss with beamline personnel to determine the most suitable column for your sample:

1.    Superdex 75 10/300 (MW < 75,000 Da)

2.    Superdex 200 Increase 10/300 (MW < 400,000 Da)

3.    Superose 6 10/300 (MW < 5,000,000 Da)

     Quartz capillary flow-cell

     Lumen is 1.5 mm in diameter with a 10 µm wall

     0.5-1 s exposure every 2-3 s

     X-ray wavelength is 1.033 Å (12 keV; this seldom changes)

     Data recorded on a Pilatus3 1M (Dectris) detector at a sample-to-detector distance of ~3.5 m. This covers a momentum transfer range of ~0.005 Å⁻¹ < q < 0.4 Å⁻¹ (see the raw *.dat file to determine this for your specific data)

     Data usually collected at room temperature (discuss more specific needs with beamline personnel).

     Scattering data are normalized to the incident X-ray beam intensity

     Data are reduced to three-column data (q, I(q) and σI(q)) via a Python script
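
     To check the q-range for a specific data set, the geometry and the reduced files can be compared. The sketch below uses the standard relation q = 4π sin(θ)/λ, where 2θ is the scattering angle, and a simple reader for three-column text. The function names, the assumption that .dat files are whitespace-delimited with non-numeric header lines, and the 246 mm example radius are illustrative assumptions, not details of the beamline scripts.

```python
import math


def q_from_radius(radius_mm, distance_mm, wavelength_A):
    """q in 1/Å for a detector position radius_mm away from the beam center."""
    two_theta = math.atan2(radius_mm, distance_mm)  # full scattering angle
    return 4.0 * math.pi * math.sin(two_theta / 2.0) / wavelength_A


def read_dat(lines):
    """Parse (q, I, sigma) rows; skip blank, comment, and header lines."""
    rows = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue
        try:
            q, intensity, sigma = (float(p) for p in parts[:3])
        except ValueError:
            continue  # non-numeric line, e.g. a header
        rows.append((q, intensity, sigma))
    return rows


def q_range(rows):
    """Smallest and largest q present in the parsed rows."""
    qs = [row[0] for row in rows]
    return min(qs), max(qs)


# Geometry estimate: 3.5 m distance, 1.033 Å, hypothetical 246 mm radius
print(round(q_from_radius(246.0, 3500.0, 1.033), 3))

# Made-up example content standing in for a reduced frame
example = ["# q I sigma", "0.0050 120.0 1.5", "0.0100 118.2 1.4", "0.4000 0.9 0.2"]
print(q_range(read_dat(example)))
```

     To use this on a real file, pass an open file handle to read_dat (for example, with open(...) as fh: rows = read_dat(fh)).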

 

 

COLLECTION WORKFLOW

     Switching buffer/equilibration

     SEC-MALS-SAXS setup needs long equilibration (6h to overnight), which precludes too many buffer exchanges during a single run.

     Split the buffer in half so both inlets A and B can pump buffer (*prepare enough buffer and bring it in two bottles in advance*). 2 L is usually sufficient.

     Change flow rate by 0.1 ml/min about every minute (after pressure levels out) as you stop the flow of one buffer and begin the flow of a new buffer

     Once back up to 0.8 ml/min, equilibrate for at least 6 more hours

     If equilibrating overnight, ramp up to 0.8 ml/min and equilibrate at that rate overnight

     Make sure you clean the flow cell for MALS/DLS in between buffers, especially if there’s more glycerol etc. (anything that could change refractive index)

     ALSO: keep glycerol concentrations as low as possible (preferably < 5%).
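
     The stepwise flow change described above (0.1 ml/min about every minute) can be written out as a list of setpoints. This is only a planning sketch with hypothetical names; the times are nominal, because each step should wait until the pressure has leveled out.

```python
import math


def ramp_schedule(start, target, step=0.1, interval_min=1.0):
    """Return (elapsed_minutes, flow_ml_per_min) setpoints from start to target."""
    if step <= 0:
        raise ValueError("step must be positive")
    span = abs(target - start)
    # Round before ceil so float noise does not add a spurious extra step
    n_steps = math.ceil(round(span / step, 6))
    direction = 1 if target >= start else -1
    schedule = []
    for k in range(1, n_steps + 1):
        change = min(k * step, span)  # never overshoot the target
        schedule.append((k * interval_min, round(start + direction * change, 3)))
    return schedule


# Ramp the old buffer down, then the new buffer up to 0.8 ml/min
for minutes, flow in ramp_schedule(0.8, 0.0) + ramp_schedule(0.0, 0.8):
    print(f"{minutes:4.1f} min  {flow:.1f} ml/min")
```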

 

     Injecting sample and starting HPLC run

     Each run will take about 22 minutes. Have samples ready (*concentrate or dilute to the appropriate concentration and spin down for 10-15 minutes) so that near the end of one run you can:

1.    clean the capillary

2.    have the autosampler inject your sample just as the next run is about to start

     Each run should be about done after 18 min. Stop collecting SAXS data then, and you will have about 5 min to clean the capillary and put the new sample in the sample tray

     If you think you will be running late, time can be extended

     To change the time, right-click in Quat. pump, select Method, and change the time

     Preparing your sample

     Samples are injected from vials. 250-350 µl will be injected, but fill vials to ~50 µl more than the injection volume (900 µl is the upper limit)

     Washing the system

     Wash the capillary

     Do not have the HPLC and wash pump connected to the capillary at the same time

     Connect the capillary to the Hamilton syringe pump (on desktop) and press wash

     When done, capillary can be reconnected to the HPLC (red outlet 1 line)

     Make sure tube is inserted all the way (will feel resistance) and screw lines together

     BE CAREFUL MANEUVERING THE PEEK TUBES CONNECTED TO THE CAPILLARY, AS THE CAPILLARY WALLS ARE 10 MICRON QUARTZ AND THEREFORE EXTREMELY FRAGILE.

Loading your sample

     At the beginning of each new sample set, set a program for that sample set

     In ASTRA (Wyatt's MALS software):

     Create new sequence file, type number of samples and name them

     Set the sample properties for each sample: 1) name, 2) method (MALS-dRI-SAXS), 3) time (22 min)

     In Chemstation (Agilent's HPLC software):

     Use the sample entry window to select positions in sample tray and name them

     Set the method SEC_constflow for each sample

     For each sample, put the vial in the proper position in the tray for that sample.

     Watch the autosampler pick up the vial, aspirate the sample, and replace the vial

     Once the vial has been replaced and the robot moves out of the way, remove the vial and check that a reasonable amount of sample has been aspirated to ensure proper functioning of the autosampler. Begin station search (make sure capillary has been cleaned and is hooked back up to HPLC)

     Data from HPLC runs are saved in a predetermined folder (usually the date followed by the PI's last name).

 

COMPUTER WORKFLOW: SAXS

     On the computer called Rodin, set up the LabVIEW program that records the intensity of the incident and transmitted X-ray beams. Beamline personnel will give you detailed instructions for how to do this before the start of the experiment.

     On the medm** interface that controls the detector, update the file name and change the file counter field (next file) to 1. Make sure the name you type here matches the one entered in the LabVIEW program above.

     **For the medm interface, you have to click in the box and leave the mouse in the box while typing. You must press Enter before moving the mouse. Make sure these fields are updated before you proceed with data acquisition!

     Once these fields are updated, you can start collecting data. This will open and close the shutter, exposing the sample to X-rays for 0.5 to 1 s periods every 2-3 seconds. Make sure the shutter allowing beam into hutch D is open before commencing data acquisition.

     DATA PROCESSING

     On the data reduction computer you will run three scripts (**a beamline scientist will assist you with the initial setup and will help you navigate to the correct directory to start**):

     python reducePilatus1M.py param_"name".txt "filename"

     Reduces raw scattering images to a text file (.dat file containing q, I(q) and σI(q)) and normalizes the scattering data to the incident X-ray beam intensity (it multiplies the raw data by a scaling factor) on a frame-by-frame basis. This is why you have to set the start frame on the data collection computer to 1, and why the file names on the data collection computer and the blank computer have to be the same.
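
     The frame-by-frame normalization can be illustrated as follows. This is not the code of reducePilatus1M.py: the function name, the data layout, and the choice of scale factor (a reference value divided by the incident-beam reading) are assumptions made only to show why the frame counters of the two recordings must line up.

```python
def normalize_frames(raw_frames, incident_counts, reference=1.0):
    """Scale each frame by reference / I0 for the frame with the same number.

    raw_frames      -- {frame_number: [intensity, ...]} from the detector
    incident_counts -- {frame_number: I0} from the incident-beam recording
    """
    normalized = {}
    for frame, intensities in raw_frames.items():
        if frame not in incident_counts:
            # Happens if the two recordings did not both start at frame 1
            raise KeyError(f"no incident-beam reading for frame {frame}")
        scale = reference / incident_counts[frame]
        normalized[frame] = [value * scale for value in intensities]
    return normalized


# Two made-up frames with different incident intensities
frames = {1: [10.0, 20.0], 2: [12.0, 24.0]}
i0 = {1: 2.0, 2: 2.4}
print(normalize_frames(frames, i0))
```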

     python plotsum.py param_"name".txt "filename"

     Plots the total scattering intensity (total intensity vs. frame number; this should mimic the UV peaks coming off the HPLC, but with a slight delay, which can be calculated)
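
     As a rough illustration of what such a trace contains, the sketch below sums I(q) over all q points for each frame and reports the frame with the largest total. The names are hypothetical, and plotsum.py may compute its total differently (for example, over a restricted q-range).

```python
def total_intensity(frames):
    """Sum I(q) over all q points for each frame, in collection order."""
    return [sum(i_q) for i_q in frames]


def peak_frame(totals):
    """1-based frame number with the largest summed intensity."""
    return max(range(len(totals)), key=totals.__getitem__) + 1


# Made-up I(q) values for five frames: flat baseline, peak, flat baseline
frames = [[1.0, 0.5], [1.1, 0.5], [4.0, 2.0], [1.2, 0.5], [1.0, 0.5]]
totals = total_intensity(frames)
print(totals)
print("peak at frame", peak_frame(totals))
```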

     python autosub.py param_"name".txt "filename" buffer-filename.dat

     The buffer file is generated by averaging the I(q) vs. q curves from exposures flanking the elution peak.

     To do this, look at the total scattering intensity plot (produced by the second script, above). Pick 10-20 frames in close proximity to the peak that are stable/"flat" before the peak and 10-20 after it (better to pick from true baseline than from a flat shoulder next to a peak). WRITE THE FRAME NUMBERS DOWN (**in case you need to reprocess later**).

     Average the frames from before the peak, average the frames from after the peak, and then average those two averages. To average, we will use PRIMUS:

1.    Open the .dat files of the first set of frames (the example here uses 20) by right-clicking on the files and opening them in PRIMUS. ***DO NOT SCALE THESE CURVES.***

2.    Hit "Average" in the user interface. This generates an average curve, avrg001, which is automatically saved to the directory from which you opened the selected curves.

3.    Close all curves in PRIMUS and open your next set of 20 buffer curves. Repeat the averaging to generate avrg002 (the next average file is named avrg002 automatically).

4.    Close all files in PRIMUS again and open only avrg001 and avrg002. Average these together to produce avrg003.

     We used the avrg003 file for buffer subtraction in this script. You don't HAVE to use this one, though; you may have to look at the results and decide by trial and error which buffer file is the most suitable for any given data set.
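
     For reference, the averaging described above can be sketched outside PRIMUS. This is a minimal sketch assuming an unweighted point-by-point mean with standard error propagation on a shared q grid; PRIMUS may weight or propagate errors differently, and the function name is hypothetical.

```python
import math


def average_curves(curves):
    """Unweighted point-by-point mean of curves that share one q grid.

    Each curve is a list of (q, I, sigma) tuples. Sigma of the mean is
    sqrt(sum(sigma_i^2)) / n, assuming independent errors.
    """
    n = len(curves)
    if n == 0:
        raise ValueError("need at least one curve")
    if any(len(curve) != len(curves[0]) for curve in curves):
        raise ValueError("curves have different lengths")
    averaged = []
    for points in zip(*curves):
        q = points[0][0]
        if any(abs(p[0] - q) > 1e-9 for p in points):
            raise ValueError("q grids differ")
        mean_i = sum(p[1] for p in points) / n
        sigma = math.sqrt(sum(p[2] ** 2 for p in points)) / n
        averaged.append((q, mean_i, sigma))
    return averaged


# Made-up buffer frames before and after the peak (two q points each)
before = [[(0.01, 5.0, 0.1), (0.02, 4.0, 0.1)], [(0.01, 5.2, 0.1), (0.02, 4.2, 0.1)]]
after = [[(0.01, 5.1, 0.1), (0.02, 4.1, 0.1)], [(0.01, 5.3, 0.1), (0.02, 4.3, 0.1)]]
avrg001 = average_curves(before)
avrg002 = average_curves(after)
avrg003 = average_curves([avrg001, avrg002])
print(avrg003)
```

     Note that averaging the two averages equals the mean over all selected frames only when both sets contain the same number of frames.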

     Running this script will produce fully normalized/background subtracted/buffer subtracted data. You will pick which curves to average together to produce your final scaled and averaged scattering curve
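
     The subtraction step itself amounts to a point-by-point difference. The sketch below assumes sample and buffer curves on the same q grid and independent errors added in quadrature; it illustrates the operation and is not the code of autosub.py.

```python
import math


def subtract_buffer(sample, buffer):
    """Return (q, I_sample - I_buffer, sigma) for curves on the same q grid."""
    if len(sample) != len(buffer):
        raise ValueError("curves have different lengths")
    subtracted = []
    for (q, i_s, sig_s), (q_b, i_b, sig_b) in zip(sample, buffer):
        if abs(q - q_b) > 1e-9:
            raise ValueError("q grids differ")
        # Independent errors add in quadrature
        subtracted.append((q, i_s - i_b, math.hypot(sig_s, sig_b)))
    return subtracted


# Made-up sample frame and buffer average
sample = [(0.01, 10.0, 3.0), (0.02, 8.0, 3.0)]
buffer = [(0.01, 4.0, 4.0), (0.02, 3.5, 4.0)]
print(subtract_buffer(sample, buffer))
```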

     Once you have selected the desired frames (from your peak), open them in PRIMUS. Hit **SCALE** and then hit AVERAGE in the PRIMUS user interface. This creates an avrg_001 file in the "subtract" folder. THIS avrg_001 FILE IN THE SUBTRACT FOLDER IS WHAT YOU WILL USE TO CONTINUE PROCESSING DATA. DO NOT CONFUSE IT WITH THE avrg_001 FILE THAT EXISTS IN THE HIGHER DIRECTORY; THAT avrg_001 CURVE IS AN AVERAGE OF 20 IMAGES USED TO SUBTRACT BACKGROUND.

 

SAXS PROCESSING METHOD

     Buffer subtraction performed before processing using ATSAS PRIMUS

     20 frames in close proximity to the peak, before and after it, are averaged: 20 from before the peak are averaged (avrg001), then 20 from after the peak are averaged (avrg002), then those averages are averaged (avrg003). avrg003 is generally used as the buffer to be subtracted (*generation of the average curves is explained under DATA PROCESSING above)

     PRIMUS to scale and average SAXS data

     Guinier approximation performed and Rg calculated by PRIMUS (*Linear Guinier plot indicates no aggregation or repulsion effects)

     P(r) plot calculated using autoGNOM

     P(r) used to estimate Rg, Dmax and Porod volume (which is used to estimate MW by SAXS), and to calculate ab initio models using DAMMIN or DAMMIF, which are then averaged with DAMAVER (DAMSEL, DAMSUP, DAMAVER, and DAMFILT)

     P(r) also used to calculate ab initio models using GASBOR (more reliable with higher-quality high-q data (meaning q > 0.25-0.3 Å⁻¹) and for non-globular proteins)

     Use scripts to run multiple independent GASBOR runs

     Check that the generated envelopes are unique

     Run DAMAVER to average the envelopes

     AMBIMETER score (< 1.5 indicates that the ab initio shape determination is likely unique). ***It is part of ATSAS and a measure of the ambiguity of your envelope***

     I-TASSER (or Phyre2 or Swiss-model) is used to build a homology model (if you don't have any structural information for your protein), which is then fit to the ab initio model using SUPCOMB

     SymmDock can be used to generate models with P2 and P4 symmetry (based on Phyre2 and I-TASSER model) ***This symmetry stuff is necessary if you are trying to figure out what oligomeric state the protein is in. This may not be necessary for everyone***

     CRYSOL used to calculate theoretical scattering curves of P1 (starting model), P2, and P4 homology models and compare to experimental data (*do this before you dock your model to ab initio envelope*)

ATSAS PACKAGE DATA PROCESSING

     Open averaged, buffer subtracted curve (avrg001 in subtract (folder)) in PRIMUS

     Calculate Rg, make sure Guinier region is linear
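
     For orientation, the Guinier analysis that PRIMUS performs rests on the approximation ln I(q) = ln I(0) - (Rg²/3) q², valid at low q (commonly q·Rg below about 1.3 for globular particles). The sketch below is an unweighted least-squares fit on synthetic data with hypothetical function names; it is not the PRIMUS/AUTORG algorithm, which also selects the fitting range and weights by errors.

```python
import math


def guinier_fit(q, intensity):
    """Fit ln I = ln I0 - (Rg^2 / 3) q^2 by least squares; return (Rg, I0)."""
    x = [qi * qi for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("non-negative slope: not a Guinier region")
    rg = math.sqrt(-3.0 * slope)
    i0 = math.exp(mean_y - slope * mean_x)
    return rg, i0


def within_guinier_limit(q_max, rg, limit=1.3):
    """True if the fitted range satisfies q_max * Rg <= limit."""
    return q_max * rg <= limit


# Synthetic ideal data: Rg = 30 Å, I(0) = 100, q from 0.005 to 0.04 1/Å
q = [0.005 + 0.0025 * k for k in range(15)]
intensity = [100.0 * math.exp(-(30.0 ** 2) * qi ** 2 / 3.0) for qi in q]
rg, i0 = guinier_fit(q, intensity)
print(round(rg, 3), round(i0, 3), within_guinier_limit(max(q), rg))
```

     Systematic curvature of ln I(q) vs. q² in this range, rather than a straight line, is the signature of aggregation or interparticle repulsion.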

     Generate P(r) and calculate Dmax, Porod volume - (include points out through at least q=0.15-0.2) Save the .out file (P(r) plot)

     If you decide to use the ATSAS package at a beamline computer, you can generate 10 DAMMIF models and average them, and refine with one more round of DAMMIN. (remember to save it before you click finish)

     If you have homology models, CRYSOL can be used to produce theoretical scattering curves. (CRYSOL uses .pdb file and .dat file (scattering curve))

     crysol ***.pdb ***.dat

     This will generate a .fit file containing a theoretical scattering curve of the pdb file fit to the experimental data (.dat file). The first three columns are q, Iexp(q) and Ifit(q), which are used for plotting.

     To generate envelopes, use DAMMIF and/or GASBOR (these use .out file- P(r)). Can run with different symmetries if desired.

     DAMMIF will average multiple runs on its own if accessed through primusqt

dammif ***.out -s P#

     GASBOR

     Use scripts to run multiple independent GASBOR runs

     Check that the generated envelopes are unique

     gasborp ***.out #residues -sy P#
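
     One possible shape for such a script is sketched below. It only reuses the command form shown above and runs each job in its own directory so that outputs do not overwrite each other. The directory names, the default of 10 runs, and the file name sample.out are hypothetical; depending on the ATSAS version, the executable may be named differently or may prompt for further input, so check the installed program before relying on this.

```python
import subprocess
from pathlib import Path


def gasbor_commands(out_file, n_residues, symmetry="P1", n_runs=10):
    """Build one (run_directory, argv) pair per independent run."""
    out_path = Path(out_file).resolve()  # absolute path, valid from any run dir
    jobs = []
    for k in range(1, n_runs + 1):
        run_dir = Path(f"gasbor_run_{k:02d}")  # hypothetical naming scheme
        argv = ["gasborp", str(out_path), str(n_residues), "-sy", symmetry]
        jobs.append((run_dir, argv))
    return jobs


def run_jobs(jobs, dry_run=True):
    """Print the commands, or execute them one after another if dry_run is False."""
    for run_dir, argv in jobs:
        if dry_run:
            print(f"[{run_dir}] {' '.join(argv)}")
            continue
        run_dir.mkdir(exist_ok=True)
        subprocess.run(argv, cwd=run_dir, check=True)


# Dry run: list the commands for three P2 runs on a hypothetical sample.out
run_jobs(gasbor_commands("sample.out", 250, symmetry="P2", n_runs=3))
```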

     DAMAVER - to average the ab initio envelopes

     If averaging results of multiple DAMMIF runs

     damaver -a DAMMIF-*-1.pdb

     If averaging results of multiple GASBOR runs

     damaver -a GASBOR*.pdb

     CRYSOL can be used to back calculate scattering curves from your envelopes, but this isn't very meaningful and shouldn't be done.

     SUPCOMB will superimpose envelopes to homology models

     supcomb **.pdb ****.pdb

     The output file will be ****r.pdb, where:

     ** = reference (homology) model

     **** = moving (envelope) model

 

COMPUTER WORKFLOW: MALS

     In ASTRA (Wyatt's software for SEC in-line MALS):

     Open an experiment (your file)

     Open Experiments tab on the bottom left

     Configuration -> generic pump (on left side panel)

     Set the flow rate (default is 0.8, but remember to change this to the real flow rate for any given experiment).

     Procedures -> baselines (on left)

     First click autofind baselines

     Click through the baselines and make sure all of them are defined (if they are not good, you have to define them: click once at the start point, hold, and drag to the right to define the baseline and range)

     Click Apply

     In the top toolbar, Experiment -> configuration -> band broadening

     Define an area

     Click Perform fit

     Manually define peaks if needed

     Click Apply

     Click Peaks on left side panel

     Choose what peaks (UV curve) you will be working with (better to choose range above baseline)

     Click Apply

     Click Molecular Mass & Radius from LS on left side panel

     De-select detectors that had bad baselines

     Click Apply

     Can scroll through peaks to see MW for each peak from MALS

     Click Rh from QELS on left side panel

     Can scroll through peak and see Rh from QELS (DLS)