src/video/SDL_blit_A.c
author Ryan C. Gordon <icculus@icculus.org>
Sun, 08 Jan 2006 21:18:15 +0000
changeset 1240 3b8a43c428bb
parent 1175 867f521591e5
child 1312 c9b51268668f
permissions -rw-r--r--
From Bug #36: There are a couple of issues with the selection of Altivec alpha-blitting routines in CalculateAlphaBlit() in src/video/SDL_Blit_A.c. 1) There's no check for the presence of Altivec when checking if the Blit32to565PixelAlphaAltivec() routine can be selected. 2) Altivec cannot be used in video memory, and there's no check if the destination surface is a hardware surface. (Alpha-blitting to a hardware surface with GPU support is a bad idea, but somebody's bound to do it anyway.) Patch to fix these attached.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
     1
/*
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
     2
    SDL - Simple DirectMedia Layer
769
b8d311d90021 Updated copyright information for 2004 (Happy New Year!)
Sam Lantinga <slouken@libsdl.org>
parents: 739
diff changeset
     3
    Copyright (C) 1997-2004 Sam Lantinga
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
     4
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
     5
    This library is free software; you can redistribute it and/or
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
     6
    modify it under the terms of the GNU Library General Public
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
     7
    License as published by the Free Software Foundation; either
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
     8
    version 2 of the License, or (at your option) any later version.
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
     9
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    10
    This library is distributed in the hope that it will be useful,
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    11
    but WITHOUT ANY WARRANTY; without even the implied warranty of
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    12
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    13
    Library General Public License for more details.
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    14
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    15
    You should have received a copy of the GNU Library General Public
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    16
    License along with this library; if not, write to the Free
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    17
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    18
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    19
    Sam Lantinga
252
e8157fcb3114 Updated the source with the correct e-mail address
Sam Lantinga <slouken@libsdl.org>
parents: 1
diff changeset
    20
    slouken@libsdl.org
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    21
*/
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    22
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    23
#ifdef SAVE_RCSID
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    24
static char rcsid =
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    25
 "@(#) $Id$";
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    26
#endif
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    27
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    28
#include <stdio.h>
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    29
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    30
#include "SDL_types.h"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    31
#include "SDL_video.h"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    32
#include "SDL_blit.h"
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    33
880
9ef41050100c Date: Tue, 30 Mar 2004 21:26:47 -0600
Sam Lantinga <slouken@libsdl.org>
parents: 769
diff changeset
    34
#if (defined(i386) || defined(__x86_64__)) && defined(__GNUC__) && defined(USE_ASMBLIT)
9ef41050100c Date: Tue, 30 Mar 2004 21:26:47 -0600
Sam Lantinga <slouken@libsdl.org>
parents: 769
diff changeset
    35
#define MMX_ASMBLIT
9ef41050100c Date: Tue, 30 Mar 2004 21:26:47 -0600
Sam Lantinga <slouken@libsdl.org>
parents: 769
diff changeset
    36
#endif
9ef41050100c Date: Tue, 30 Mar 2004 21:26:47 -0600
Sam Lantinga <slouken@libsdl.org>
parents: 769
diff changeset
    37
739
22dbf364c017 Added SDL_HasMMX(), SDL_Has3DNow(), SDL_HasSSE() in SDL_cpuinfo.h
Sam Lantinga <slouken@libsdl.org>
parents: 720
diff changeset
    38
/* Function to check the CPU flags */
22dbf364c017 Added SDL_HasMMX(), SDL_Has3DNow(), SDL_HasSSE() in SDL_cpuinfo.h
Sam Lantinga <slouken@libsdl.org>
parents: 720
diff changeset
    39
#include "SDL_cpuinfo.h"
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
    40
#ifdef MMX_ASMBLIT
689
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
    41
#include "mmx.h"
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
    42
#endif
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
    43
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    44
/* Functions to perform alpha blended blitting */
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    45
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    46
/* N->1 blending with per-surface alpha */
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    47
static void BlitNto1SurfaceAlpha(SDL_BlitInfo *info)
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    48
{
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    49
	int width = info->d_width;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    50
	int height = info->d_height;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    51
	Uint8 *src = info->s_pixels;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    52
	int srcskip = info->s_skip;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    53
	Uint8 *dst = info->d_pixels;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    54
	int dstskip = info->d_skip;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    55
	Uint8 *palmap = info->table;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    56
	SDL_PixelFormat *srcfmt = info->src;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    57
	SDL_PixelFormat *dstfmt = info->dst;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    58
	int srcbpp = srcfmt->BytesPerPixel;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    59
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    60
	const unsigned A = srcfmt->alpha;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    61
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    62
	while ( height-- ) {
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    63
	    DUFFS_LOOP4(
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    64
	    {
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
    65
		Uint32 Pixel;
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    66
		unsigned sR;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    67
		unsigned sG;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    68
		unsigned sB;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    69
		unsigned dR;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    70
		unsigned dG;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    71
		unsigned dB;
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
    72
		DISEMBLE_RGB(src, srcbpp, srcfmt, Pixel, sR, sG, sB);
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    73
		dR = dstfmt->palette->colors[*dst].r;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    74
		dG = dstfmt->palette->colors[*dst].g;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    75
		dB = dstfmt->palette->colors[*dst].b;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    76
		ALPHA_BLEND(sR, sG, sB, A, dR, dG, dB);
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    77
		dR &= 0xff;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    78
		dG &= 0xff;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    79
		dB &= 0xff;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    80
		/* Pack RGB into 8bit pixel */
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    81
		if ( palmap == NULL ) {
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    82
		    *dst =((dR>>5)<<(3+2))|
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    83
			  ((dG>>5)<<(2))|
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    84
			  ((dB>>6)<<(0));
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    85
		} else {
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    86
		    *dst = palmap[((dR>>5)<<(3+2))|
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    87
				  ((dG>>5)<<(2))  |
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    88
				  ((dB>>6)<<(0))];
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    89
		}
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    90
		dst++;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    91
		src += srcbpp;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    92
	    },
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    93
	    width);
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    94
	    src += srcskip;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    95
	    dst += dstskip;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    96
	}
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    97
}
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    98
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
    99
/* N->1 blending with pixel alpha */
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   100
static void BlitNto1PixelAlpha(SDL_BlitInfo *info)
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   101
{
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   102
	int width = info->d_width;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   103
	int height = info->d_height;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   104
	Uint8 *src = info->s_pixels;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   105
	int srcskip = info->s_skip;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   106
	Uint8 *dst = info->d_pixels;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   107
	int dstskip = info->d_skip;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   108
	Uint8 *palmap = info->table;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   109
	SDL_PixelFormat *srcfmt = info->src;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   110
	SDL_PixelFormat *dstfmt = info->dst;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   111
	int srcbpp = srcfmt->BytesPerPixel;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   112
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   113
	/* FIXME: fix alpha bit field expansion here too? */
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   114
	while ( height-- ) {
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   115
	    DUFFS_LOOP4(
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   116
	    {
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   117
		Uint32 Pixel;
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   118
		unsigned sR;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   119
		unsigned sG;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   120
		unsigned sB;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   121
		unsigned sA;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   122
		unsigned dR;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   123
		unsigned dG;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   124
		unsigned dB;
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   125
		DISEMBLE_RGBA(src,srcbpp,srcfmt,Pixel,sR,sG,sB,sA);
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   126
		dR = dstfmt->palette->colors[*dst].r;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   127
		dG = dstfmt->palette->colors[*dst].g;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   128
		dB = dstfmt->palette->colors[*dst].b;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   129
		ALPHA_BLEND(sR, sG, sB, sA, dR, dG, dB);
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   130
		dR &= 0xff;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   131
		dG &= 0xff;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   132
		dB &= 0xff;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   133
		/* Pack RGB into 8bit pixel */
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   134
		if ( palmap == NULL ) {
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   135
		    *dst =((dR>>5)<<(3+2))|
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   136
			  ((dG>>5)<<(2))|
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   137
			  ((dB>>6)<<(0));
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   138
		} else {
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   139
		    *dst = palmap[((dR>>5)<<(3+2))|
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   140
				  ((dG>>5)<<(2))  |
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   141
				  ((dB>>6)<<(0))  ];
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   142
		}
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   143
		dst++;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   144
		src += srcbpp;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   145
	    },
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   146
	    width);
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   147
	    src += srcskip;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   148
	    dst += dstskip;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   149
	}
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   150
}
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   151
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   152
/* colorkeyed N->1 blending with per-surface alpha */
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   153
static void BlitNto1SurfaceAlphaKey(SDL_BlitInfo *info)
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   154
{
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   155
	int width = info->d_width;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   156
	int height = info->d_height;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   157
	Uint8 *src = info->s_pixels;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   158
	int srcskip = info->s_skip;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   159
	Uint8 *dst = info->d_pixels;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   160
	int dstskip = info->d_skip;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   161
	Uint8 *palmap = info->table;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   162
	SDL_PixelFormat *srcfmt = info->src;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   163
	SDL_PixelFormat *dstfmt = info->dst;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   164
	int srcbpp = srcfmt->BytesPerPixel;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   165
	Uint32 ckey = srcfmt->colorkey;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   166
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   167
	const int A = srcfmt->alpha;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   168
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   169
	while ( height-- ) {
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   170
	    DUFFS_LOOP(
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   171
	    {
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   172
		Uint32 Pixel;
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   173
		unsigned sR;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   174
		unsigned sG;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   175
		unsigned sB;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   176
		unsigned dR;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   177
		unsigned dG;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   178
		unsigned dB;
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   179
		DISEMBLE_RGB(src, srcbpp, srcfmt, Pixel, sR, sG, sB);
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   180
		if ( Pixel != ckey ) {
0
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   181
		    dR = dstfmt->palette->colors[*dst].r;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   182
		    dG = dstfmt->palette->colors[*dst].g;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   183
		    dB = dstfmt->palette->colors[*dst].b;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   184
		    ALPHA_BLEND(sR, sG, sB, A, dR, dG, dB);
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   185
		    dR &= 0xff;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   186
		    dG &= 0xff;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   187
		    dB &= 0xff;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   188
		    /* Pack RGB into 8bit pixel */
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   189
		    if ( palmap == NULL ) {
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   190
			*dst =((dR>>5)<<(3+2))|
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   191
			      ((dG>>5)<<(2)) |
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   192
			      ((dB>>6)<<(0));
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   193
		    } else {
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   194
			*dst = palmap[((dR>>5)<<(3+2))|
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   195
				      ((dG>>5)<<(2))  |
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   196
				      ((dB>>6)<<(0))  ];
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   197
		    }
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   198
		}
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   199
		dst++;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   200
		src += srcbpp;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   201
	    },
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   202
	    width);
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   203
	    src += srcskip;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   204
	    dst += dstskip;
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   205
	}
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   206
}
74212992fb08 Initial revision
Sam Lantinga <slouken@lokigames.com>
parents:
diff changeset
   207
880
9ef41050100c Date: Tue, 30 Mar 2004 21:26:47 -0600
Sam Lantinga <slouken@libsdl.org>
parents: 769
diff changeset
   208
#ifdef MMX_ASMBLIT
689
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   209
/* fast RGB888->(A)RGB888 blending with surface alpha=128 special case */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   210
static void BlitRGBtoRGBSurfaceAlpha128MMX(SDL_BlitInfo *info)
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   211
{
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   212
	int width = info->d_width;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   213
	int height = info->d_height;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   214
	Uint32 *srcp = (Uint32 *)info->s_pixels;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   215
	int srcskip = info->s_skip >> 2;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   216
	Uint32 *dstp = (Uint32 *)info->d_pixels;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   217
	int dstskip = info->d_skip >> 2;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   218
        Uint8 load[8];
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   219
  
720
f90d80d68071 N Sep 17 8791 Sam Lantinga Re: tks source released
Sam Lantinga <slouken@libsdl.org>
parents: 689
diff changeset
   220
        *(Uint64 *)load = 0x00fefefe00fefefeULL;/* alpha128 mask */
689
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   221
        movq_m2r(*load, mm4); /* alpha128 mask -> mm4 */
720
f90d80d68071 N Sep 17 8791 Sam Lantinga Re: tks source released
Sam Lantinga <slouken@libsdl.org>
parents: 689
diff changeset
   222
        *(Uint64 *)load = 0x0001010100010101ULL;/* !alpha128 mask */
689
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   223
        movq_m2r(*load, mm3); /* !alpha128 mask -> mm3 */
720
f90d80d68071 N Sep 17 8791 Sam Lantinga Re: tks source released
Sam Lantinga <slouken@libsdl.org>
parents: 689
diff changeset
   224
        *(Uint64 *)load = 0xFF000000FF000000ULL;/* dst alpha mask */
689
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   225
        movq_m2r(*load, mm7); /* dst alpha mask -> mm7 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   226
	while(height--) {
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   227
            DUFFS_LOOP_DOUBLE2(
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   228
            {
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   229
		    Uint32 s = *srcp++;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   230
		    Uint32 d = *dstp;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   231
		    *dstp++ = ((((s & 0x00fefefe) + (d & 0x00fefefe)) >> 1)
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   232
			       + (s & d & 0x00010101)) | 0xff000000;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   233
            },{
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   234
	            movq_m2r((*dstp), mm2);/* 2 x dst -> mm2(ARGBARGB) */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   235
	            movq_r2r(mm2, mm6); /* 2 x dst -> mm6(ARGBARGB) */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   236
	      
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   237
	            movq_m2r((*srcp), mm1);/* 2 x src -> mm1(ARGBARGB) */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   238
	            movq_r2r(mm1, mm5); /* 2 x src -> mm5(ARGBARGB) */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   239
		
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   240
	            pand_r2r(mm4, mm6); /* dst & mask -> mm6 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   241
	            pand_r2r(mm4, mm5); /* src & mask -> mm5 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   242
	            paddd_r2r(mm6, mm5); /* mm6 + mm5 -> mm5 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   243
	            psrld_i2r(1, mm5); /* mm5 >> 1 -> mm5 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   244
	
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   245
	            pand_r2r(mm1, mm2); /* src & dst -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   246
	            pand_r2r(mm3, mm2); /* mm2 & !mask -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   247
	            paddd_r2r(mm5, mm2); /* mm5 + mm2 -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   248
	            por_r2r(mm7, mm2); /* mm7(full alpha) | mm2 -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   249
	            movq_r2m(mm2, (*dstp));/* mm2 -> 2 x dst pixels */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   250
	            dstp += 2;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   251
	            srcp += 2;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   252
            }, width);
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   253
	    srcp += srcskip;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   254
	    dstp += dstskip;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   255
	}
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   256
	emms();
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   257
}
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   258
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   259
/* fast RGB888->(A)RGB888 blending with surface alpha */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   260
static void BlitRGBtoRGBSurfaceAlphaMMX(SDL_BlitInfo *info)
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   261
{
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   262
	unsigned alpha = info->src->alpha;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   263
	if(alpha == 128) {
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   264
		BlitRGBtoRGBSurfaceAlpha128MMX(info);
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   265
	} else {
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   266
		int width = info->d_width;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   267
		int height = info->d_height;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   268
		Uint32 *srcp = (Uint32 *)info->s_pixels;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   269
		int srcskip = info->s_skip >> 2;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   270
		Uint32 *dstp = (Uint32 *)info->d_pixels;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   271
		int dstskip = info->d_skip >> 2;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   272
                Uint8 load[8] = {alpha, alpha, alpha, alpha,
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   273
    					alpha, alpha, alpha, alpha};
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   274
					
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   275
                movq_m2r(*load, mm4); /* alpha -> mm4 */
720
f90d80d68071 N Sep 17 8791 Sam Lantinga Re: tks source released
Sam Lantinga <slouken@libsdl.org>
parents: 689
diff changeset
   276
		*(Uint64 *)load = 0x00FF00FF00FF00FFULL;
689
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   277
                movq_m2r(*load, mm3); /* mask -> mm3 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   278
		pand_r2r(mm3, mm4); /* mm4 & mask -> 0A0A0A0A -> mm4 */
720
f90d80d68071 N Sep 17 8791 Sam Lantinga Re: tks source released
Sam Lantinga <slouken@libsdl.org>
parents: 689
diff changeset
   279
		*(Uint64 *)load = 0xFF000000FF000000ULL;/* dst alpha mask */
689
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   280
		movq_m2r(*load, mm7); /* dst alpha mask -> mm7 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   281
		
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   282
		while(height--) {
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   283
			DUFFS_LOOP_DOUBLE2({
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   284
				/* One Pixel Blend */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   285
	                        movd_m2r((*srcp), mm1);/* src(ARGB) -> mm1 (0000ARGB)*/
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   286
                                punpcklbw_r2r(mm1, mm1); /* AARRGGBB -> mm1 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   287
                                pand_r2r(mm3, mm1); /* 0A0R0G0B -> mm1 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   288
			  
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   289
	                        movd_m2r((*dstp), mm2);/* dst(ARGB) -> mm2 (0000ARGB)*/
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   290
			        movq_r2r(mm2, mm6);/* dst(ARGB) -> mm6 (0000ARGB)*/
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   291
                                punpcklbw_r2r(mm2, mm2); /* AARRGGBB -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   292
                                pand_r2r(mm3, mm2); /* 0A0R0G0B -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   293
			  
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   294
                                psubw_r2r(mm2, mm1);/* src - dst -> mm1 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   295
	                        pmullw_r2r(mm4, mm1); /* mm1 * alpha -> mm1 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   296
	                        psrlw_i2r(8, mm1); /* mm1 >> 8 -> mm1 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   297
	                        paddw_r2r(mm1, mm2); /* mm1 + mm2(dst) -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   298
	                        pand_r2r(mm3, mm2); /* 0A0R0G0B -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   299
	                        packuswb_r2r(mm2, mm2);  /* ARGBARGB -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   300
	                        por_r2r(mm7, mm2); /* mm7(full alpha) | mm2 -> mm2 */
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   301
			        movd_r2m(mm2, *dstp);/* mm2 -> Pixel */
689
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   302
				++srcp;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   303
				++dstp;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   304
			},{
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   305
			        /* Two Pixels Blend */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   306
				movq_m2r((*srcp), mm0);/* 2 x src -> mm0(ARGBARGB)*/
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   307
			        movq_r2r(mm0, mm1); /* 2 x src -> mm1(ARGBARGB) */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   308
                                punpcklbw_r2r(mm0, mm0); /* low - AARRGGBB -> mm0 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   309
			        pand_r2r(mm3, mm0); /* 0A0R0G0B -> mm0(src1) */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   310
			        punpckhbw_r2r(mm1, mm1); /* high - AARRGGBB -> mm1 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   311
	                        pand_r2r(mm3, mm1); /* 0A0R0G0B -> mm1(src2) */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   312
	
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   313
	                        movq_m2r((*dstp), mm2);/* 2 x dst -> mm2(ARGBARGB) */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   314
	                        movq_r2r(mm2, mm5); /* 2 x dst -> mm5(ARGBARGB) */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   315
			        movq_r2r(mm2, mm6); /* 2 x dst -> mm6(ARGBARGB) */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   316
                                punpcklbw_r2r(mm2, mm2); /* low - AARRGGBB -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   317
	                        punpckhbw_r2r(mm6, mm6); /* high - AARRGGBB -> mm6 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   318
                                pand_r2r(mm3, mm2); /* 0A0R0G0B -> mm2(dst1) */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   319
	                  
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   320
                                psubw_r2r(mm2, mm0);/* src1 - dst1 -> mm0 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   321
	                        pmullw_r2r(mm4, mm0); /* mm0 * alpha -> mm0 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   322
			        pand_r2r(mm3, mm6); /* 0A0R0G0B -> mm6(dst2) */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   323
			        psrlw_i2r(8, mm0); /* mm0 >> 8 -> mm1 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   324
			        psubw_r2r(mm6, mm1);/* src2 - dst2 -> mm1 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   325
	                        pmullw_r2r(mm4, mm1); /* mm1 * alpha -> mm1 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   326
				paddw_r2r(mm0, mm2); /* mm0 + mm2(dst1) -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   327
	                        psrlw_i2r(8, mm1); /* mm1 >> 8 -> mm0 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   328
				pand_r2r(mm3, mm2); /* 0A0R0G0B -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   329
	                        paddw_r2r(mm1, mm6); /* mm1 + mm6(dst2) -> mm6 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   330
	                        pand_r2r(mm3, mm6); /* 0A0R0G0B -> mm6 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   331
	                        packuswb_r2r(mm2, mm2);  /* ARGBARGB -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   332
	                        packuswb_r2r(mm6, mm6);  /* ARGBARGB -> mm6 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   333
	                        psrlq_i2r(32, mm2); /* mm2 >> 32 -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   334
	                        psllq_i2r(32, mm6); /* mm6 << 32 -> mm6 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   335
	                        por_r2r(mm6, mm2); /* mm6 | mm2 -> mm2 */				
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   336
				por_r2r(mm7, mm2); /* mm7(full alpha) | mm2 -> mm2 */
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   337
                                movq_r2m(mm2, *dstp);/* mm2 -> 2 x Pixel */
689
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   338
				srcp += 2;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   339
				dstp += 2;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   340
			}, width);
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   341
			srcp += srcskip;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   342
			dstp += dstskip;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   343
		}
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   344
		emms();
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   345
	}
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   346
}
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   347
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   348
/* fast ARGB888->(A)RGB888 blending with pixel alpha */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   349
static void BlitRGBtoRGBPixelAlphaMMX(SDL_BlitInfo *info)
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   350
{
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   351
	int width = info->d_width;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   352
	int height = info->d_height;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   353
	Uint32 *srcp = (Uint32 *)info->s_pixels;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   354
	int srcskip = info->s_skip >> 2;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   355
	Uint32 *dstp = (Uint32 *)info->d_pixels;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   356
	int dstskip = info->d_skip >> 2;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   357
        Uint32 alpha = 0;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   358
        Uint8 load[8];
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   359
	                
720
f90d80d68071 N Sep 17 8791 Sam Lantinga Re: tks source released
Sam Lantinga <slouken@libsdl.org>
parents: 689
diff changeset
   360
	*(Uint64 *)load = 0x00FF00FF00FF00FFULL;
689
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   361
        movq_m2r(*load, mm3); /* mask -> mm2 */
720
f90d80d68071 N Sep 17 8791 Sam Lantinga Re: tks source released
Sam Lantinga <slouken@libsdl.org>
parents: 689
diff changeset
   362
	*(Uint64 *)load = 0x00FF000000000000ULL;
689
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   363
        movq_m2r(*load, mm7); /* dst alpha mask -> mm2 */
720
f90d80d68071 N Sep 17 8791 Sam Lantinga Re: tks source released
Sam Lantinga <slouken@libsdl.org>
parents: 689
diff changeset
   364
        *(Uint64 *)load = 0x00FFFFFF00FFFFFFULL;
689
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   365
        movq_m2r(*load, mm0); /* alpha 255 mask -> mm0 */
720
f90d80d68071 N Sep 17 8791 Sam Lantinga Re: tks source released
Sam Lantinga <slouken@libsdl.org>
parents: 689
diff changeset
   366
        *(Uint64 *)load = 0xFF000000FF000000ULL;
689
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   367
        movq_m2r(*load, mm6); /* alpha 255 !mask -> mm6 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   368
	while(height--) {
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   369
	    DUFFS_LOOP4({
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   370
	        alpha = *srcp;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   371
	        alpha >>= 24;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   372
		/* FIXME: Here we special-case opaque alpha since the
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   373
		   compositioning used (>>8 instead of /255) doesn't handle
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   374
		   it correctly. Also special-case alpha=0 for speed?
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   375
		   Benchmark this! */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   376
		if(alpha) {   
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   377
		  if(alpha == SDL_ALPHA_OPAQUE) {
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   378
		    movd_m2r((*srcp), mm1);/* src(ARGB) -> mm1 (0000ARGB)*/
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   379
		    movd_m2r((*dstp), mm2);/* dst(ARGB) -> mm2 (0000ARGB)*/
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   380
		    pand_r2r(mm0, mm1);
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   381
		    pand_r2r(mm6, mm2);
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   382
		    por_r2r(mm1, mm2);
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   383
		    movd_r2m(mm2, (*dstp));
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   384
		  } else {
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   385
		    movd_m2r((*srcp), mm1);/* src(ARGB) -> mm1 (0000ARGB)*/
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   386
                    punpcklbw_r2r(mm1, mm1); /* AARRGGBB -> mm1 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   387
                    pand_r2r(mm3, mm1); /* 0A0R0G0B -> mm1 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   388
			  
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   389
	            movd_m2r((*dstp), mm2);/* dst(ARGB) -> mm2 (0000ARGB)*/
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   390
                    punpcklbw_r2r(mm2, mm2); /* AARRGGBB -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   391
                    pand_r2r(mm3, mm2); /* 0A0R0G0B -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   392
		
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   393
		    movq_r2r(mm2, mm5);/* mm2(0A0R0G0B) -> mm5 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   394
		    pand_r2r(mm7, mm5); /* mm5 & dst alpha mask -> mm5(0A000000) */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   395
		    psrlq_i2r(24, mm5); /* mm5 >> 24 -> mm5 (0000A000)*/
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   396
		    
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   397
		    movq_r2r(mm1, mm4);/* mm1(0A0R0G0B) -> mm4 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   398
		    psrlq_i2r(48, mm4); /* mm4 >> 48 -> mm4(0000000A) */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   399
		    punpcklwd_r2r(mm4, mm4); /* 00000A0A -> mm4 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   400
                    punpcklwd_r2r(mm4, mm4); /* 0A0A0A0A -> mm4 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   401
		                        		    
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   402
                    /* blend */		    
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   403
                    psubw_r2r(mm2, mm1);/* src - dst -> mm1 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   404
	            pmullw_r2r(mm4, mm1); /* mm1 * alpha -> mm1 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   405
	            psrlw_i2r(8, mm1); /* mm1 >> 8 -> mm1 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   406
	            paddw_r2r(mm1, mm2); /* mm1 + mm2(dst) -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   407
	            pand_r2r(mm3, mm2); /* 0A0R0G0B -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   408
		    packuswb_r2r(mm2, mm2);  /* ARGBARGB -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   409
		    pand_r2r(mm0, mm2); /* 0RGB0RGB -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   410
		    por_r2r(mm5, mm2); /* dst alpha | mm2 -> mm2 */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   411
		    movd_r2m(mm2, *dstp);/* mm2 -> dst */
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   412
		  }
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   413
		}
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   414
		++srcp;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   415
		++dstp;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   416
	    }, width);
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   417
	    srcp += srcskip;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   418
	    dstp += dstskip;
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   419
	}
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   420
	emms();
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   421
}
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   422
#endif
5bb080d35049 Date: Tue, 19 Aug 2003 17:57:00 +0200
Sam Lantinga <slouken@libsdl.org>
parents: 297
diff changeset
   423
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   424
#ifdef USE_ALTIVEC_BLITTERS
1175
867f521591e5 Fixed Altivec support on Mac OS X.
Ryan C. Gordon <icculus@icculus.org>
parents: 1162
diff changeset
   425
#ifdef HAVE_ALTIVEC_H
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   426
#include <altivec.h>
1175
867f521591e5 Fixed Altivec support on Mac OS X.
Ryan C. Gordon <icculus@icculus.org>
parents: 1162
diff changeset
   427
#endif
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   428
#include <assert.h>
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   429
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   430
#if ((defined MACOSX) && (__GNUC__ < 4))
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   431
    #define VECUINT8_LITERAL(a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p) \
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   432
        (vector unsigned char) ( a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p )
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   433
    #define VECUINT16_LITERAL(a,b,c,d,e,f,g,h) \
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   434
        (vector unsigned short) ( a,b,c,d,e,f,g,h )
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   435
#else
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   436
    #define VECUINT8_LITERAL(a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p) \
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   437
        (vector unsigned char) { a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p }
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   438
    #define VECUINT16_LITERAL(a,b,c,d,e,f,g,h) \
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   439
        (vector unsigned short) { a,b,c,d,e,f,g,h }
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   440
#endif
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   441
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   442
#define UNALIGNED_PTR(x) (((size_t) x) & 0x0000000F)
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   443
#define VECPRINT(msg, v) do { \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   444
    vector unsigned int tmpvec = (vector unsigned int)(v); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   445
    unsigned int *vp = (unsigned int *)&tmpvec; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   446
    printf("%s = %08X %08X %08X %08X\n", msg, vp[0], vp[1], vp[2], vp[3]); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   447
} while (0)
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   448
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   449
/* the permuation vector that takes the high bytes out of all the appropriate shorts 
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   450
    (vector unsigned char)(
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   451
        0x00, 0x10, 0x02, 0x12,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   452
        0x04, 0x14, 0x06, 0x16,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   453
        0x08, 0x18, 0x0A, 0x1A,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   454
        0x0C, 0x1C, 0x0E, 0x1E );
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   455
*/
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   456
#define VEC_MERGE_PERMUTE() (vec_add(vec_lvsl(0, (int*)NULL), (vector unsigned char)vec_splat_u16(0x0F)))
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   457
#define VEC_U32_24() (vec_add(vec_splat_u32(12), vec_splat_u32(12)))
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   458
#define VEC_ALPHA_MASK() ((vector unsigned char)vec_sl((vector unsigned int)vec_splat_s8(-1), VEC_U32_24()))
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   459
#define VEC_ALIGNER(src) ((UNALIGNED_PTR(src)) \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   460
    ? vec_lvsl(0, src) \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   461
    : vec_add(vec_lvsl(8, src), vec_splat_u8(8)))
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   462
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   463
   
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   464
#define VEC_MULTIPLY_ALPHA(vs, vd, valpha, mergePermute, v1_16, v8_16) do { \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   465
    /* vtemp1 contains source AAGGAAGGAAGGAAGG */ \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   466
    vector unsigned short vtemp1 = vec_mule(vs, valpha); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   467
    /* vtemp2 contains source RRBBRRBBRRBBRRBB */ \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   468
    vector unsigned short vtemp2 = vec_mulo(vs, valpha); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   469
    /* valpha2 is 255-alpha */ \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   470
    vector unsigned char valpha2 = vec_nor(valpha, valpha); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   471
    /* vtemp3 contains dest AAGGAAGGAAGGAAGG */ \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   472
    vector unsigned short vtemp3 = vec_mule(vd, valpha2); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   473
    /* vtemp4 contains dest RRBBRRBBRRBBRRBB */ \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   474
    vector unsigned short vtemp4 = vec_mulo(vd, valpha2); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   475
    /* add source and dest */ \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   476
    vtemp1 = vec_add(vtemp1, vtemp3); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   477
    vtemp2 = vec_add(vtemp2, vtemp4); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   478
    /* vtemp1 = (vtemp1 + 1) + ((vtemp1 + 1) >> 8) */ \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   479
    vtemp1 = vec_add(vtemp1, v1_16); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   480
    vtemp3 = vec_sr(vtemp1, v8_16); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   481
    vtemp1 = vec_add(vtemp1, vtemp3); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   482
    /* vtemp2 = (vtemp2 + 1) + ((vtemp2 + 1) >> 8) */ \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   483
    vtemp2 = vec_add(vtemp2, v1_16); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   484
    vtemp4 = vec_sr(vtemp2, v8_16); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   485
    vtemp2 = vec_add(vtemp2, vtemp4); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   486
    /* (>>8) and get ARGBARGBARGBARGB */ \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   487
    vd = (vector unsigned char)vec_perm(vtemp1, vtemp2, mergePermute); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   488
} while (0)
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   489
 
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   490
/* Calculate the permute vector used for 32->32 swizzling */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   491
static vector unsigned char calc_swizzle32(const SDL_PixelFormat *srcfmt,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   492
                                  const SDL_PixelFormat *dstfmt)
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   493
{
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   494
    /*
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   495
     * We have to assume that the bits that aren't used by other
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   496
     *  colors is alpha, and it's one complete byte, since some formats
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   497
     *  leave alpha with a zero mask, but we should still swizzle the bits.
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   498
     */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   499
    /* ARGB */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   500
    const static struct SDL_PixelFormat default_pixel_format = {
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   501
        NULL, 0, 0,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   502
        0, 0, 0, 0,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   503
        16, 8, 0, 24,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   504
        0x00FF0000, 0x0000FF00, 0x000000FF, 0xFF000000,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   505
        0, 0};
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   506
    if (!srcfmt) {
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   507
        srcfmt = &default_pixel_format;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   508
    }
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   509
    if (!dstfmt) {
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   510
        dstfmt = &default_pixel_format;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   511
    }
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   512
    vector unsigned char plus = VECUINT8_LITERAL
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   513
                                            ( 0x00, 0x00, 0x00, 0x00,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   514
                                              0x04, 0x04, 0x04, 0x04,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   515
                                              0x08, 0x08, 0x08, 0x08,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   516
                                              0x0C, 0x0C, 0x0C, 0x0C );
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   517
    vector unsigned char vswiz;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   518
    vector unsigned int srcvec;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   519
#define RESHIFT(X) (3 - ((X) >> 3))
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   520
    Uint32 rmask = RESHIFT(srcfmt->Rshift) << (dstfmt->Rshift);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   521
    Uint32 gmask = RESHIFT(srcfmt->Gshift) << (dstfmt->Gshift);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   522
    Uint32 bmask = RESHIFT(srcfmt->Bshift) << (dstfmt->Bshift);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   523
    Uint32 amask;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   524
    /* Use zero for alpha if either surface doesn't have alpha */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   525
    if (dstfmt->Amask) {
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   526
        amask = ((srcfmt->Amask) ? RESHIFT(srcfmt->Ashift) : 0x10) << (dstfmt->Ashift);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   527
    } else {
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   528
        amask = 0x10101010 & ((dstfmt->Rmask | dstfmt->Gmask | dstfmt->Bmask) ^ 0xFFFFFFFF);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   529
    }
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   530
#undef RESHIFT  
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   531
    ((unsigned int *)(char*)&srcvec)[0] = (rmask | gmask | bmask | amask);
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   532
    vswiz = vec_add(plus, (vector unsigned char)vec_splat(srcvec, 0));
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   533
    return(vswiz);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   534
}
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   535
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   536
static void Blit32to565PixelAlphaAltivec(SDL_BlitInfo *info)
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   537
{
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   538
    int height = info->d_height;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   539
    Uint8 *src = (Uint8 *)info->s_pixels;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   540
    int srcskip = info->s_skip;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   541
    Uint8 *dst = (Uint8 *)info->d_pixels;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   542
    int dstskip = info->d_skip;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   543
    SDL_PixelFormat *srcfmt = info->src;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   544
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   545
    vector unsigned char v0 = vec_splat_u8(0);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   546
    vector unsigned short v8_16 = vec_splat_u16(8);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   547
    vector unsigned short v1_16 = vec_splat_u16(1);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   548
    vector unsigned short v2_16 = vec_splat_u16(2);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   549
    vector unsigned short v3_16 = vec_splat_u16(3);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   550
    vector unsigned int v8_32 = vec_splat_u32(8);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   551
    vector unsigned int v16_32 = vec_add(v8_32, v8_32);
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   552
    vector unsigned short v3f = VECUINT16_LITERAL(
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   553
        0x003f, 0x003f, 0x003f, 0x003f,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   554
        0x003f, 0x003f, 0x003f, 0x003f);
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   555
    vector unsigned short vfc = VECUINT16_LITERAL(
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   556
        0x00fc, 0x00fc, 0x00fc, 0x00fc,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   557
        0x00fc, 0x00fc, 0x00fc, 0x00fc);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   558
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   559
    /* 
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   560
        0x10 - 0x1f is the alpha
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   561
        0x00 - 0x0e evens are the red
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   562
        0x01 - 0x0f odds are zero
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   563
    */
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   564
    vector unsigned char vredalpha1 = VECUINT8_LITERAL(
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   565
        0x10, 0x00, 0x01, 0x01,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   566
        0x10, 0x02, 0x01, 0x01,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   567
        0x10, 0x04, 0x01, 0x01,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   568
        0x10, 0x06, 0x01, 0x01
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   569
    );
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   570
    vector unsigned char vredalpha2 = (vector unsigned char)(
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   571
        vec_add((vector unsigned int)vredalpha1, vec_sl(v8_32, v16_32))
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   572
    );
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   573
    /*
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   574
        0x00 - 0x0f is ARxx ARxx ARxx ARxx
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   575
        0x11 - 0x0f odds are blue
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   576
    */
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   577
    vector unsigned char vblue1 = VECUINT8_LITERAL(
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   578
        0x00, 0x01, 0x02, 0x11,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   579
        0x04, 0x05, 0x06, 0x13,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   580
        0x08, 0x09, 0x0a, 0x15,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   581
        0x0c, 0x0d, 0x0e, 0x17
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   582
    );
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   583
    vector unsigned char vblue2 = (vector unsigned char)(
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   584
        vec_add((vector unsigned int)vblue1, v8_32)
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   585
    );
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   586
    /*
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   587
        0x00 - 0x0f is ARxB ARxB ARxB ARxB
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   588
        0x10 - 0x0e evens are green
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   589
    */
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   590
    vector unsigned char vgreen1 = VECUINT8_LITERAL(
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   591
        0x00, 0x01, 0x10, 0x03,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   592
        0x04, 0x05, 0x12, 0x07,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   593
        0x08, 0x09, 0x14, 0x0b,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   594
        0x0c, 0x0d, 0x16, 0x0f
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   595
    );
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   596
    vector unsigned char vgreen2 = (vector unsigned char)(
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   597
        vec_add((vector unsigned int)vgreen1, vec_sl(v8_32, v8_32))
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   598
    );
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   599
    vector unsigned char vgmerge = VECUINT8_LITERAL(
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   600
        0x00, 0x02, 0x00, 0x06,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   601
        0x00, 0x0a, 0x00, 0x0e,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   602
        0x00, 0x12, 0x00, 0x16,
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   603
        0x00, 0x1a, 0x00, 0x1e);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   604
    vector unsigned char mergePermute = VEC_MERGE_PERMUTE();
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   605
    vector unsigned char vpermute = calc_swizzle32(srcfmt, NULL);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   606
    vector unsigned char valphaPermute = vec_and(vec_lvsl(0, (int *)NULL), vec_splat_u8(0xC));
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   607
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   608
    vector unsigned short vf800 = (vector unsigned short)vec_splat_u8(-7);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   609
    vf800 = vec_sl(vf800, vec_splat_u16(8));
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   610
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   611
    while(height--) {
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   612
        int extrawidth;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   613
        vector unsigned char valigner;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   614
        vector unsigned char vsrc;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   615
        vector unsigned char voverflow;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   616
        int width = info->d_width;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   617
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   618
#define ONE_PIXEL_BLEND(condition, widthvar) \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   619
        while (condition) { \
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   620
            Uint32 Pixel; \
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   621
            unsigned sR, sG, sB, dR, dG, dB, sA; \
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   622
            DISEMBLE_RGBA(src, 4, srcfmt, Pixel, sR, sG, sB, sA); \
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   623
            if(sA) { \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   624
                unsigned short dstpixel = *((unsigned short *)dst); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   625
                dR = (dstpixel >> 8) & 0xf8; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   626
                dG = (dstpixel >> 3) & 0xfc; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   627
                dB = (dstpixel << 3) & 0xf8; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   628
                ACCURATE_ALPHA_BLEND(sR, sG, sB, sA, dR, dG, dB); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   629
                *((unsigned short *)dst) = ( \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   630
                    ((dR & 0xf8) << 8) | ((dG & 0xfc) << 3) | (dB >> 3) \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   631
                ); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   632
            } \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   633
            src += 4; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   634
            dst += 2; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   635
            widthvar--; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   636
        }
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   637
        ONE_PIXEL_BLEND((UNALIGNED_PTR(dst)) && (width), width);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   638
        extrawidth = (width % 8);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   639
        valigner = VEC_ALIGNER(src);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   640
        vsrc = (vector unsigned char)vec_ld(0, src);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   641
        width -= extrawidth;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   642
        while (width) {
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   643
            vector unsigned char valpha;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   644
            vector unsigned char vsrc1, vsrc2;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   645
            vector unsigned char vdst1, vdst2;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   646
            vector unsigned short vR, vG, vB;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   647
            vector unsigned short vpixel, vrpixel, vgpixel, vbpixel;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   648
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   649
            /* Load 8 pixels from src as ARGB */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   650
            voverflow = (vector unsigned char)vec_ld(15, src);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   651
            vsrc = vec_perm(vsrc, voverflow, valigner);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   652
            vsrc1 = vec_perm(vsrc, vsrc, vpermute);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   653
            src += 16;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   654
            vsrc = (vector unsigned char)vec_ld(15, src);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   655
            voverflow = vec_perm(voverflow, vsrc, valigner);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   656
            vsrc2 = vec_perm(voverflow, voverflow, vpermute);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   657
            src += 16;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   658
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   659
            /* Load 8 pixels from dst as XRGB */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   660
            voverflow = vec_ld(0, dst);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   661
            vR = vec_and((vector unsigned short)voverflow, vf800);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   662
            vB = vec_sl((vector unsigned short)voverflow, v3_16);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   663
            vG = vec_sl(vB, v2_16);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   664
            vdst1 = (vector unsigned char)vec_perm((vector unsigned char)vR, (vector unsigned char)vR, vredalpha1);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   665
            vdst1 = vec_perm(vdst1, (vector unsigned char)vB, vblue1);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   666
            vdst1 = vec_perm(vdst1, (vector unsigned char)vG, vgreen1);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   667
            vdst2 = (vector unsigned char)vec_perm((vector unsigned char)vR, (vector unsigned char)vR, vredalpha2);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   668
            vdst2 = vec_perm(vdst2, (vector unsigned char)vB, vblue2);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   669
            vdst2 = vec_perm(vdst2, (vector unsigned char)vG, vgreen2);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   670
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   671
            /* Alpha blend 8 pixels as ARGB */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   672
            valpha = vec_perm(vsrc1, v0, valphaPermute);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   673
            VEC_MULTIPLY_ALPHA(vsrc1, vdst1, valpha, mergePermute, v1_16, v8_16);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   674
            valpha = vec_perm(vsrc2, v0, valphaPermute);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   675
            VEC_MULTIPLY_ALPHA(vsrc2, vdst2, valpha, mergePermute, v1_16, v8_16);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   676
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   677
            /* Convert 8 pixels to 565 */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   678
            vpixel = (vector unsigned short)vec_packpx((vector unsigned int)vdst1, (vector unsigned int)vdst2);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   679
            vgpixel = (vector unsigned short)vec_perm(vdst1, vdst2, vgmerge);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   680
            vgpixel = vec_and(vgpixel, vfc);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   681
            vgpixel = vec_sl(vgpixel, v3_16);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   682
            vrpixel = vec_sl(vpixel, v1_16);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   683
            vrpixel = vec_and(vrpixel, vf800);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   684
            vbpixel = vec_and(vpixel, v3f);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   685
            vdst1 = vec_or((vector unsigned char)vrpixel, (vector unsigned char)vgpixel);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   686
            vdst1 = vec_or(vdst1, (vector unsigned char)vbpixel);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   687
            
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   688
            /* Store 8 pixels */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   689
            vec_st(vdst1, 0, dst);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   690
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   691
            width -= 8;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   692
            dst += 16;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   693
        }
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   694
        ONE_PIXEL_BLEND((extrawidth), extrawidth);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   695
#undef ONE_PIXEL_BLEND
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   696
        src += srcskip;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   697
        dst += dstskip;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   698
    }
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   699
}
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   700
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   701
static void Blit32to32SurfaceAlphaKeyAltivec(SDL_BlitInfo *info)
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   702
{
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   703
    unsigned alpha = info->src->alpha;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   704
    int height = info->d_height;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   705
    Uint32 *srcp = (Uint32 *)info->s_pixels;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   706
    int srcskip = info->s_skip >> 2;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   707
    Uint32 *dstp = (Uint32 *)info->d_pixels;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   708
    int dstskip = info->d_skip >> 2;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   709
    SDL_PixelFormat *srcfmt = info->src;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   710
    SDL_PixelFormat *dstfmt = info->dst;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   711
    unsigned sA = srcfmt->alpha;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   712
    unsigned dA = dstfmt->Amask ? SDL_ALPHA_OPAQUE : 0;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   713
    Uint32 rgbmask = srcfmt->Rmask | srcfmt->Gmask | srcfmt->Bmask;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   714
    Uint32 ckey = info->src->colorkey;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   715
    vector unsigned char mergePermute;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   716
    vector unsigned char vsrcPermute;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   717
    vector unsigned char vdstPermute;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   718
    vector unsigned char vsdstPermute;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   719
    vector unsigned char valpha;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   720
    vector unsigned char valphamask;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   721
    vector unsigned char vbits;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   722
    vector unsigned char v0;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   723
    vector unsigned short v1;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   724
    vector unsigned short v8;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   725
    vector unsigned int vckey;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   726
    vector unsigned int vrgbmask;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   727
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   728
    mergePermute = VEC_MERGE_PERMUTE();
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   729
    v0 = vec_splat_u8(0);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   730
    v1 = vec_splat_u16(1);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   731
    v8 = vec_splat_u16(8);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   732
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   733
    /* set the alpha to 255 on the destination surf */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   734
    valphamask = VEC_ALPHA_MASK();
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   735
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   736
    vsrcPermute = calc_swizzle32(srcfmt, NULL);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   737
    vdstPermute = calc_swizzle32(NULL, dstfmt);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   738
    vsdstPermute = calc_swizzle32(dstfmt, NULL);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   739
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   740
    /* set a vector full of alpha and 255-alpha */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   741
    ((unsigned char *)&valpha)[0] = alpha;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   742
    valpha = vec_splat(valpha, 0);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   743
    vbits = (vector unsigned char)vec_splat_s8(-1);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   744
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   745
    ckey &= rgbmask;
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   746
    ((unsigned int *)(char*)&vckey)[0] = ckey;
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   747
    vckey = vec_splat(vckey, 0);
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   748
    ((unsigned int *)(char*)&vrgbmask)[0] = rgbmask;
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   749
    vrgbmask = vec_splat(vrgbmask, 0);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   750
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   751
    while(height--) {
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   752
        int width = info->d_width;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   753
#define ONE_PIXEL_BLEND(condition, widthvar) \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   754
        while (condition) { \
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   755
            Uint32 Pixel; \
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   756
            unsigned sR, sG, sB, dR, dG, dB; \
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   757
            RETRIEVE_RGB_PIXEL(((Uint8 *)srcp), 4, Pixel); \
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   758
            if(sA && Pixel != ckey) { \
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   759
                RGB_FROM_PIXEL(Pixel, srcfmt, sR, sG, sB); \
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   760
                DISEMBLE_RGB(((Uint8 *)dstp), 4, dstfmt, Pixel, dR, dG, dB); \
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   761
                ACCURATE_ALPHA_BLEND(sR, sG, sB, sA, dR, dG, dB); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   762
                ASSEMBLE_RGBA(((Uint8 *)dstp), 4, dstfmt, dR, dG, dB, dA); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   763
            } \
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   764
            dstp++; \
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   765
            srcp++; \
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   766
            widthvar--; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   767
        }
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   768
        ONE_PIXEL_BLEND((UNALIGNED_PTR(dstp)) && (width), width);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   769
        if (width > 0) {
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   770
            int extrawidth = (width % 4);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   771
            vector unsigned char valigner = VEC_ALIGNER(srcp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   772
            vector unsigned char vs = (vector unsigned char)vec_ld(0, srcp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   773
            width -= extrawidth;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   774
            while (width) {
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   775
                vector unsigned char vsel;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   776
                vector unsigned char voverflow;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   777
                vector unsigned char vd;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   778
                vector unsigned char vd_orig;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   779
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   780
                /* s = *srcp */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   781
                voverflow = (vector unsigned char)vec_ld(15, srcp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   782
                vs = vec_perm(vs, voverflow, valigner);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   783
                
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   784
                /* vsel is set for items that match the key */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   785
                vsel = (vector unsigned char)vec_and((vector unsigned int)vs, vrgbmask);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   786
                vsel = (vector unsigned char)vec_cmpeq((vector unsigned int)vsel, vckey);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   787
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   788
                /* permute to source format */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   789
                vs = vec_perm(vs, valpha, vsrcPermute);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   790
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   791
                /* d = *dstp */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   792
                vd = (vector unsigned char)vec_ld(0, dstp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   793
                vd_orig = vd = vec_perm(vd, v0, vsdstPermute);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   794
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   795
                VEC_MULTIPLY_ALPHA(vs, vd, valpha, mergePermute, v1, v8);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   796
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   797
                /* set the alpha channel to full on */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   798
                vd = vec_or(vd, valphamask);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   799
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   800
                /* mask out color key */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   801
                vd = vec_sel(vd, vd_orig, vsel);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   802
                
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   803
                /* permute to dest format */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   804
                vd = vec_perm(vd, vbits, vdstPermute);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   805
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   806
                /* *dstp = res */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   807
                vec_st((vector unsigned int)vd, 0, dstp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   808
                
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   809
                srcp += 4;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   810
                dstp += 4;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   811
                width -= 4;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   812
                vs = voverflow;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   813
            }
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   814
            ONE_PIXEL_BLEND((extrawidth), extrawidth);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   815
        }
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   816
#undef ONE_PIXEL_BLEND
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   817
 
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   818
        srcp += srcskip;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   819
        dstp += dstskip;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   820
    }
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   821
}
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   822
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   823
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   824
static void Blit32to32PixelAlphaAltivec(SDL_BlitInfo *info)
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   825
{
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   826
    int width = info->d_width;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   827
    int height = info->d_height;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   828
    Uint32 *srcp = (Uint32 *)info->s_pixels;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   829
    int srcskip = info->s_skip >> 2;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   830
    Uint32 *dstp = (Uint32 *)info->d_pixels;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   831
    int dstskip = info->d_skip >> 2;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   832
    SDL_PixelFormat *srcfmt = info->src;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   833
    SDL_PixelFormat *dstfmt = info->dst;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   834
    vector unsigned char mergePermute;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   835
    vector unsigned char valphaPermute;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   836
    vector unsigned char vsrcPermute;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   837
    vector unsigned char vdstPermute;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   838
    vector unsigned char vsdstPermute;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   839
    vector unsigned char valphamask;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   840
    vector unsigned char vpixelmask;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   841
    vector unsigned char v0;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   842
    vector unsigned short v1;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   843
    vector unsigned short v8;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   844
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   845
    v0 = vec_splat_u8(0);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   846
    v1 = vec_splat_u16(1);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   847
    v8 = vec_splat_u16(8);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   848
    mergePermute = VEC_MERGE_PERMUTE();
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   849
    valphamask = VEC_ALPHA_MASK();
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   850
    valphaPermute = vec_and(vec_lvsl(0, (int *)NULL), vec_splat_u8(0xC));
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   851
    vpixelmask = vec_nor(valphamask, v0);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   852
    vsrcPermute = calc_swizzle32(srcfmt, NULL);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   853
    vdstPermute = calc_swizzle32(NULL, dstfmt);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   854
    vsdstPermute = calc_swizzle32(dstfmt, NULL);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   855
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   856
	while ( height-- ) {
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   857
        width = info->d_width;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   858
#define ONE_PIXEL_BLEND(condition, widthvar) while ((condition)) { \
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   859
            Uint32 Pixel; \
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   860
            unsigned sR, sG, sB, dR, dG, dB, sA, dA; \
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   861
            DISEMBLE_RGBA((Uint8 *)srcp, 4, srcfmt, Pixel, sR, sG, sB, sA); \
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   862
            if(sA) { \
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
   863
              DISEMBLE_RGBA((Uint8 *)dstp, 4, dstfmt, Pixel, dR, dG, dB, dA); \
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   864
              ACCURATE_ALPHA_BLEND(sR, sG, sB, sA, dR, dG, dB); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   865
              ASSEMBLE_RGBA((Uint8 *)dstp, 4, dstfmt, dR, dG, dB, dA); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   866
            } \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   867
            ++srcp; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   868
            ++dstp; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   869
            widthvar--; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   870
        }
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   871
        ONE_PIXEL_BLEND((UNALIGNED_PTR(dstp)) && (width), width);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   872
        if (width > 0) {
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   873
            // vsrcPermute
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   874
            // vdstPermute
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   875
            int extrawidth = (width % 4);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   876
            vector unsigned char valigner = VEC_ALIGNER(srcp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   877
            vector unsigned char vs = (vector unsigned char)vec_ld(0, srcp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   878
            width -= extrawidth;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   879
            while (width) {
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   880
                vector unsigned char voverflow;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   881
                vector unsigned char vd;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   882
                vector unsigned char valpha;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   883
                vector unsigned char vdstalpha;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   884
                /* s = *srcp */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   885
                voverflow = (vector unsigned char)vec_ld(15, srcp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   886
                vs = vec_perm(vs, voverflow, valigner);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   887
                vs = vec_perm(vs, v0, vsrcPermute);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   888
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   889
                valpha = vec_perm(vs, v0, valphaPermute);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   890
                
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   891
                /* d = *dstp */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   892
                vd = (vector unsigned char)vec_ld(0, dstp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   893
                vd = vec_perm(vd, v0, vsdstPermute);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   894
                vdstalpha = vec_and(vd, valphamask);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   895
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   896
                VEC_MULTIPLY_ALPHA(vs, vd, valpha, mergePermute, v1, v8);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   897
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   898
                /* set the alpha to the dest alpha */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   899
                vd = vec_and(vd, vpixelmask);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   900
                vd = vec_or(vd, vdstalpha);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   901
                vd = vec_perm(vd, v0, vdstPermute);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   902
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   903
                /* *dstp = res */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   904
                vec_st((vector unsigned int)vd, 0, dstp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   905
                
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   906
                srcp += 4;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   907
                dstp += 4;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   908
                width -= 4;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   909
                vs = voverflow;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   910
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   911
            }
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   912
            ONE_PIXEL_BLEND((extrawidth), extrawidth);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   913
        }
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   914
	    srcp += srcskip;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   915
	    dstp += dstskip;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   916
#undef ONE_PIXEL_BLEND
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   917
	}
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   918
}
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   919
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   920
/* fast ARGB888->(A)RGB888 blending with pixel alpha */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   921
static void BlitRGBtoRGBPixelAlphaAltivec(SDL_BlitInfo *info)
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   922
{
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   923
	int width = info->d_width;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   924
	int height = info->d_height;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   925
	Uint32 *srcp = (Uint32 *)info->s_pixels;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   926
	int srcskip = info->s_skip >> 2;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   927
	Uint32 *dstp = (Uint32 *)info->d_pixels;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   928
	int dstskip = info->d_skip >> 2;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   929
    vector unsigned char mergePermute;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   930
    vector unsigned char valphaPermute;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   931
    vector unsigned char valphamask;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   932
    vector unsigned char vpixelmask;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   933
    vector unsigned char v0;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   934
    vector unsigned short v1;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   935
    vector unsigned short v8;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   936
    v0 = vec_splat_u8(0);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   937
    v1 = vec_splat_u16(1);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   938
    v8 = vec_splat_u16(8);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   939
    mergePermute = VEC_MERGE_PERMUTE();
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   940
    valphamask = VEC_ALPHA_MASK();
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   941
    valphaPermute = vec_and(vec_lvsl(0, (int *)NULL), vec_splat_u8(0xC));
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   942
    
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   943
 
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   944
    vpixelmask = vec_nor(valphamask, v0);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   945
	while(height--) {
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   946
        width = info->d_width;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   947
#define ONE_PIXEL_BLEND(condition, widthvar) \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   948
        while ((condition)) { \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   949
            Uint32 dalpha; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   950
            Uint32 d; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   951
            Uint32 s1; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   952
            Uint32 d1; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   953
            Uint32 s = *srcp; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   954
            Uint32 alpha = s >> 24; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   955
            if(alpha) { \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   956
              if(alpha == SDL_ALPHA_OPAQUE) { \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   957
                *dstp = (s & 0x00ffffff) | (*dstp & 0xff000000); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   958
              } else { \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   959
                d = *dstp; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   960
                dalpha = d & 0xff000000; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   961
                s1 = s & 0xff00ff; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   962
                d1 = d & 0xff00ff; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   963
                d1 = (d1 + ((s1 - d1) * alpha >> 8)) & 0xff00ff; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   964
                s &= 0xff00; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   965
                d &= 0xff00; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   966
                d = (d + ((s - d) * alpha >> 8)) & 0xff00; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   967
                *dstp = d1 | d | dalpha; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   968
              } \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   969
            } \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   970
            ++srcp; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   971
            ++dstp; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   972
            widthvar--; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   973
	    }
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   974
        ONE_PIXEL_BLEND((UNALIGNED_PTR(dstp)) && (width), width);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   975
        if (width > 0) {
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   976
            int extrawidth = (width % 4);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   977
            vector unsigned char valigner = VEC_ALIGNER(srcp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   978
            vector unsigned char vs = (vector unsigned char)vec_ld(0, srcp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   979
            width -= extrawidth;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   980
            while (width) {
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   981
                vector unsigned char voverflow;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   982
                vector unsigned char vd;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   983
                vector unsigned char valpha;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   984
                vector unsigned char vdstalpha;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   985
                /* s = *srcp */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   986
                voverflow = (vector unsigned char)vec_ld(15, srcp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   987
                vs = vec_perm(vs, voverflow, valigner);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   988
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   989
                valpha = vec_perm(vs, v0, valphaPermute);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   990
                
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   991
                /* d = *dstp */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   992
                vd = (vector unsigned char)vec_ld(0, dstp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   993
                vdstalpha = vec_and(vd, valphamask);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   994
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   995
                VEC_MULTIPLY_ALPHA(vs, vd, valpha, mergePermute, v1, v8);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   996
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   997
                /* set the alpha to the dest alpha */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   998
                vd = vec_and(vd, vpixelmask);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
   999
                vd = vec_or(vd, vdstalpha);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1000
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1001
                /* *dstp = res */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1002
                vec_st((vector unsigned int)vd, 0, dstp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1003
                
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1004
                srcp += 4;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1005
                dstp += 4;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1006
                width -= 4;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1007
                vs = voverflow;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1008
            }
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1009
            ONE_PIXEL_BLEND((extrawidth), extrawidth);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1010
        }
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1011
	    srcp += srcskip;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1012
	    dstp += dstskip;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1013
	}
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1014
#undef ONE_PIXEL_BLEND
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1015
}
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1016
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1017
static void Blit32to32SurfaceAlphaAltivec(SDL_BlitInfo *info)
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1018
{
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1019
    /* XXX : 6 */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1020
	unsigned alpha = info->src->alpha;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1021
    int height = info->d_height;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1022
    Uint32 *srcp = (Uint32 *)info->s_pixels;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1023
    int srcskip = info->s_skip >> 2;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1024
    Uint32 *dstp = (Uint32 *)info->d_pixels;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1025
    int dstskip = info->d_skip >> 2;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1026
    SDL_PixelFormat *srcfmt = info->src;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1027
    SDL_PixelFormat *dstfmt = info->dst;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1028
	unsigned sA = srcfmt->alpha;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1029
	unsigned dA = dstfmt->Amask ? SDL_ALPHA_OPAQUE : 0;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1030
    vector unsigned char mergePermute;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1031
    vector unsigned char vsrcPermute;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1032
    vector unsigned char vdstPermute;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1033
    vector unsigned char vsdstPermute;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1034
    vector unsigned char valpha;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1035
    vector unsigned char valphamask;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1036
    vector unsigned char vbits;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1037
    vector unsigned short v1;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1038
    vector unsigned short v8;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1039
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1040
    mergePermute = VEC_MERGE_PERMUTE();
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1041
    v1 = vec_splat_u16(1);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1042
    v8 = vec_splat_u16(8);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1043
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1044
    /* set the alpha to 255 on the destination surf */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1045
    valphamask = VEC_ALPHA_MASK();
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1046
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1047
    vsrcPermute = calc_swizzle32(srcfmt, NULL);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1048
    vdstPermute = calc_swizzle32(NULL, dstfmt);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1049
    vsdstPermute = calc_swizzle32(dstfmt, NULL);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1050
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1051
    /* set a vector full of alpha and 255-alpha */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1052
    ((unsigned char *)&valpha)[0] = alpha;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1053
    valpha = vec_splat(valpha, 0);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1054
    vbits = (vector unsigned char)vec_splat_s8(-1);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1055
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1056
    while(height--) {
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1057
        int width = info->d_width;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1058
#define ONE_PIXEL_BLEND(condition, widthvar) while ((condition)) { \
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
  1059
            Uint32 Pixel; \
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1060
            unsigned sR, sG, sB, dR, dG, dB; \
1162
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
  1061
            DISEMBLE_RGB(((Uint8 *)srcp), 4, srcfmt, Pixel, sR, sG, sB); \
2651158f59b8 Enable altivec blitters on PowerPC Linux, and some fixes for recent
Ryan C. Gordon <icculus@icculus.org>
parents: 1047
diff changeset
  1062
            DISEMBLE_RGB(((Uint8 *)dstp), 4, dstfmt, Pixel, dR, dG, dB); \
1047
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1063
            ACCURATE_ALPHA_BLEND(sR, sG, sB, sA, dR, dG, dB); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1064
            ASSEMBLE_RGBA(((Uint8 *)dstp), 4, dstfmt, dR, dG, dB, dA); \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1065
            ++srcp; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1066
            ++dstp; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1067
            widthvar--; \
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1068
        }
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1069
        ONE_PIXEL_BLEND((UNALIGNED_PTR(dstp)) && (width), width);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1070
        if (width > 0) {
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1071
            int extrawidth = (width % 4);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1072
            vector unsigned char valigner = vec_lvsl(0, srcp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1073
            vector unsigned char vs = (vector unsigned char)vec_ld(0, srcp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1074
            width -= extrawidth;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1075
            while (width) {
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1076
                vector unsigned char voverflow;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1077
                vector unsigned char vd;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1078
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1079
                /* s = *srcp */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1080
                voverflow = (vector unsigned char)vec_ld(15, srcp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1081
                vs = vec_perm(vs, voverflow, valigner);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1082
                vs = vec_perm(vs, valpha, vsrcPermute);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1083
                
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1084
                /* d = *dstp */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1085
                vd = (vector unsigned char)vec_ld(0, dstp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1086
                vd = vec_perm(vd, vd, vsdstPermute);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1087
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1088
                VEC_MULTIPLY_ALPHA(vs, vd, valpha, mergePermute, v1, v8);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1089
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1090
                /* set the alpha channel to full on */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1091
                vd = vec_or(vd, valphamask);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1092
                vd = vec_perm(vd, vbits, vdstPermute);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1093
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1094
                /* *dstp = res */
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1095
                vec_st((vector unsigned int)vd, 0, dstp);
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1096
                
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1097
                srcp += 4;
ffaaf7ecf685 Altivec-optimized blitters!
Ryan C. Gordon <icculus@icculus.org>
parents: 880
diff changeset
  1098
                dstp += 4;
ffaaf7ecf685 Altivec-optimized blitters!
Ry&